Package 'whep' reference manual

Title:	Processing Agro-Environmental Data
Description:	A set of tools for processing and analyzing data developed in the context of the "Who Has Eaten the Planet" (WHEP) project, funded by the European Research Council (ERC). For more details on the multi-regional input–output model "Food and Agriculture Biomass Input–Output" (FABIO), see Bruckner et al. (2019) <doi:10.1021/acs.est.9b03554>.
Authors:	Catalin Covaci [aut, cre] (ORCID: <https://orcid.org/0009-0005-2186-5972>), Eduardo Aguilera [aut, cph] (ORCID: <https://orcid.org/0000-0003-4382-124X>), Alice Beckmann [aut] (ORCID: <https://orcid.org/0009-0009-6840-0258>), Juan Infante [aut] (ORCID: <https://orcid.org/0000-0003-1446-7181>), Justin Morgan [aut] (ORCID: <https://orcid.org/0009-0003-7022-4288>), João Serra [ctb] (ORCID: <https://orcid.org/0000-0002-3561-5350>), European Research Council [fnd]
Maintainer:	Catalin Covaci <[email protected]>
License:	MIT + file LICENSE
Version:	0.3.0.9000
Built:	2026-07-24 18:12:45 UTC
Source:	https://github.com/eduaguilera/whep

Get area codes from area names

Description

Add a new column to an existing tibble with the corresponding code for each name. The codes are assumed to be from those defined by the FABIO model.

Usage

add_area_code(table, name_column = "area_name", code_column = "area_code")

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.
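
The name-to-code lookup described above can be sketched in base R. This is only an illustration of the matching behaviour, not the package's implementation; the lookup table and its codes are hypothetical stand-ins for the FABIO area table.

```r
# Hypothetical lookup table; the real FABIO codes ship with the package.
lookup <- data.frame(
  area_name = c("Alpha", "Beta"),
  area_code = c(101, 102)
)
table <- data.frame(area_name = c("Beta", "Nowhere", "Alpha"))

# match() returns NA for names absent from the lookup, so the new column
# holds NA wherever there is no code match.
table$area_code <- lookup$area_code[match(table$area_name, lookup$area_name)]
table
```

The same pattern, with the roles of the two columns swapped, applies to the code-to-name helpers.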

Examples

table <- tibble::tibble(
  area_name = c("Armenia", "Afghanistan", "Dummy Country", "Albania")
)

add_area_code(table)

table |>
  dplyr::rename(my_area_name = area_name) |>
  add_area_code(name_column = "my_area_name")

add_area_code(table, code_column = "my_custom_code")

Get area names from area codes

Description

Add a new column to an existing tibble with the corresponding name for each code. The codes are assumed to be from those defined by the FABIO model, which themselves come from FAOSTAT internal codes. Equivalences with ISO 3166-1 numeric codes can be found in the Area Codes CSV from the zip file that can be downloaded from FAOSTAT. TODO: Consider using ISO3 codes; they would be convenient but would not be enough for our periods.

Usage

add_area_name(table, code_column = "area_code", name_column = "area_name")

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(area_code = c(1, 2, 4444, 3))

add_area_name(table)

table |>
  dplyr::rename(my_area_code = area_code) |>
  add_area_name(code_column = "my_area_code")

add_area_name(table, name_column = "my_custom_name")

Add a final-demand product-area stage to footprints.

Description

Split footprint rows by the area that supplied the final-demand product, using shares from y_mat. This preserves the standard footprint totals while adding a FABIO-viewer style phase: origin product -> product/supplier area -> product -> final-demand area.

This is a compact global-view helper. It does not recompute the full origin-sector by product-sector Leontief cube. Instead, each existing footprint row is allocated over the product-area shares observed in final demand for the same target_area, target_fd, and target_item.

Usage

add_footprint_product_stage(
  footprints,
  y_mat,
  labels,
  fd_labels,
  max_product_areas = 5,
  other_area_name = "Other",
  min_share = 0
)

Arguments

footprints

Footprint table from compute_footprint() with target_area, target_item, target_fd, and value.

y_mat

Final demand matrix from build_io_model().

labels

Tibble mapping Y rows to area_code and item_cbs_code.

fd_labels

Tibble mapping Y columns to area_code and fd_col.

max_product_areas

Maximum number of supplier/product areas to keep separately for each final-demand area, item, and demand category. Smaller supplier areas are grouped into other_area_name.

other_area_name

Label for grouped supplier/product areas.

min_share

Drop split paths smaller than this percentage of the total input footprint value. Use 0 to keep all split paths.

Value

footprints with product_area, product_area_name, product_item, and product_share columns. value is replaced by the split path value.
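
The share-based split can be illustrated with a small base R sketch. It is a simplified reading of the description, not the function's code: the footprint value and the final-demand numbers are hypothetical, and only max_product_areas and other_area_name are taken from the arguments above.

```r
# One footprint row to split (illustrative value).
footprint_value <- 100

# Final demand for the same target area, item and demand category,
# by supplying area (hypothetical numbers standing in for y_mat entries).
fd_by_supplier <- c(A = 60, B = 30, C = 8, D = 2)

max_product_areas <- 2
other_area_name <- "Other"

# Keep the largest suppliers separately and pool the rest.
ranked <- sort(fd_by_supplier, decreasing = TRUE)
kept <- ranked[seq_len(max_product_areas)]
pooled <- sum(ranked[-seq_len(max_product_areas)])
grouped <- c(kept, stats::setNames(pooled, other_area_name))

# Shares sum to 1, so the split values add back up to the original row.
product_share <- grouped / sum(grouped)
split_value <- footprint_value * product_share
split_value
```

Because the shares are normalised within each row, the total footprint is unchanged by the split, which is the property the description relies on.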

Get commodity balance sheet item codes from item names

Description

Add a new column to an existing tibble with the corresponding code for each commodity balance sheet item name. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_cbs_code(
  table,
  name_column = "item_cbs_name",
  code_column = "item_cbs_code"
)

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.

Examples

table <- tibble::tibble(
  item_cbs_name = c("Cottonseed", "Eggs", "Dummy Item")
)
add_item_cbs_code(table)

table |>
  dplyr::rename(my_item_cbs_name = item_cbs_name) |>
  add_item_cbs_code(name_column = "my_item_cbs_name")

add_item_cbs_code(table, code_column = "my_custom_code")

Get commodity balance sheet item names from item codes

Description

Add a new column to an existing tibble with the corresponding name for each commodity balance sheet item code. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_cbs_name(
  table,
  code_column = "item_cbs_code",
  name_column = "item_cbs_name"
)

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(item_cbs_code = c(2559, 2744, 9876))
add_item_cbs_name(table)

table |>
  dplyr::rename(my_item_cbs_code = item_cbs_code) |>
  add_item_cbs_name(code_column = "my_item_cbs_code")

add_item_cbs_name(table, name_column = "my_custom_name")

Get production item codes from item names

Description

Add a new column to an existing tibble with the corresponding code for each production item name. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_prod_code(
  table,
  name_column = "item_prod_name",
  code_column = "item_prod_code"
)

Arguments

table

The table that will be modified with a new column.

name_column

The name of the column in table containing the names.

code_column

The name of the output column containing the codes.

Value

A tibble with all the contents of table and an extra column named code_column, which contains the codes. If there is no code match, an NA is included.

Examples

table <- tibble::tibble(
  item_prod_name = c("Rice", "Cabbages", "Dummy Item")
)
add_item_prod_code(table)

table |>
  dplyr::rename(my_item_prod_name = item_prod_name) |>
  add_item_prod_code(name_column = "my_item_prod_name")

add_item_prod_code(table, code_column = "my_custom_code")

Get production item names from item codes

Description

Add a new column to an existing tibble with the corresponding name for each production item code. The codes are assumed to be from those defined by FAOSTAT.

Usage

add_item_prod_name(
  table,
  code_column = "item_prod_code",
  name_column = "item_prod_name"
)

Arguments

table

The table that will be modified with a new column.

code_column

The name of the column in table containing the codes.

name_column

The name of the output column containing the names.

Value

A tibble with all the contents of table and an extra column named name_column, which contains the names. If there is no name match, an NA is included.

Examples

table <- tibble::tibble(item_prod_code = c(27, 358, 12345))
add_item_prod_name(table)

table |>
  dplyr::rename(my_item_prod_code = item_prod_code) |>
  add_item_prod_name(code_column = "my_item_prod_code")

add_item_prod_name(table, name_column = "my_custom_name")

Add WHEP polity codes to a table

Description

Adds periodized polity_code information from polity_area_crosswalk to a table with FAOSTAT/FABIO area_code values. If a year column is present, the mapping is year-aware; otherwise the current/default mapping is used.

Usage

add_polity_code(
  table,
  code_column = "area_code",
  year_column = "year",
  polity_code_column = "polity_code",
  backcast_anchor = 1961L
)

Arguments

table

A data frame.

code_column

Name of the column containing numeric area codes.

year_column

Name of the column containing years. Set to NULL to force current/default mapping.

polity_code_column

Name of the output polity-code column.

backcast_anchor

First year of reported (non-back-cast) FAOSTAT data, default 1961. Years before it are matched to the polity active in the anchor year, because WHEP's pre-anchor series are back-cast onto the anchor-year territory rather than reported under their data-year borders. Set to -Inf to disable and match strictly by data year.

Value

A tibble with added polity metadata columns.
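
One way to picture the backcast_anchor rule is as a clamp on the year used for the lookup. The sketch below is an assumption about the mechanism, written in base R with made-up years; it does not call the package or its crosswalk.

```r
backcast_anchor <- 1961L
years <- c(1850L, 1960L, 1961L, 1990L)

# Years before the anchor are looked up as if they were the anchor year;
# the anchor year and later years are looked up as-is.
lookup_year <- pmax(years, backcast_anchor)
lookup_year

# With the anchor set to -Inf, every year is matched by its own data year.
strict_year <- pmax(years, -Inf)
strict_year
```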

Aggregate gridded grass availability to polity totals.

Description

Sums gridded grass availability to polity (country, or subnational where available) totals, splitting each cell's grass by the cell's land-area share in each polity so border cells are attributed proportionally. The result is the polity grass supply ceiling for feed allocation.

Usage

aggregate_grass_to_polity(grass, cell_polity)

Arguments

grass

Gridded grass availability from build_grass_availability(), with lon, lat, year and grass_avail_dm_t.

cell_polity

Cell-to-polity mapping with lon, lat, area_code and polity_frac (the cell's land-area fraction in the polity; pass 1 for a majority assignment, e.g. from country_grid).

Value

A tibble with area_code, year and grass_avail_dm_t.
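
The proportional treatment of border cells can be sketched with base R joins. The grid cells, fractions and grass values below are invented for illustration, and the code only mirrors the arithmetic described above rather than the package internals.

```r
grass <- data.frame(
  lon = c(0.25, 0.75), lat = c(40.25, 40.25),
  year = 2000L, grass_avail_dm_t = c(100, 50)
)
# The second cell straddles a border: 40% in polity 1, 60% in polity 2.
cell_polity <- data.frame(
  lon = c(0.25, 0.75, 0.75), lat = c(40.25, 40.25, 40.25),
  area_code = c(1L, 1L, 2L), polity_frac = c(1, 0.4, 0.6)
)

# Attach each cell to its polities, weight by the land-area fraction,
# then sum per polity and year.
joined <- merge(grass, cell_polity, by = c("lon", "lat"))
joined$grass_share <- joined$grass_avail_dm_t * joined$polity_frac
totals <- aggregate(grass_share ~ area_code + year, data = joined, FUN = sum)
totals
```

Since the fractions of the border cell sum to 1, the polity totals add up to the gridded total.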

Examples

grass <- build_grass_availability(method = "lpjml", example = TRUE)
cp <- tibble::tibble(
  lon = grass$lon,
  lat = grass$lat,
  area_code = 1L,
  polity_frac = 1
)
aggregate_grass_to_polity(grass, cp)

Align an extension table to input-output sector labels.

Description

Turn a long-format extension table into the dense per-sector numeric vector that compute_footprint() expects, ordered to match a single year's labels from build_io_model(). Sectors absent from the extension are filled with zero, and extension rows outside the model are dropped. Rows sharing an (area_code, item_cbs_code) pair are summed.

Usage

align_extension(extension, labels, year, value_col = "impact_u")

Arguments

extension

Long-format extension tibble with year, area_code, item_cbs_code and the column named by value_col.

labels

One year's labels tibble from build_io_model(), with area_code, item_cbs_code and index columns.

year

Year to select from extension.

value_col

Name of the extension magnitude column, "impact_u" by default.

Value

A numeric vector with one entry per row of labels, ordered by the label index.
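
The three alignment rules (zero fill, dropping rows outside the model, summing duplicates) can be seen in a base R sketch. It uses invented data and is an illustration of the described behaviour, not the function's source.

```r
extension <- data.frame(
  year = c(2000L, 2000L, 2000L, 2001L),
  area_code = c(1L, 1L, 9L, 1L),
  item_cbs_code = c(10L, 10L, 99L, 10L),
  impact_u = c(5, 2, 7, 100)
)
labels <- data.frame(
  index = 1:2,
  area_code = c(1L, 1L),
  item_cbs_code = c(10L, 20L)
)

# Keep the requested year and build a composite key per row.
one_year <- extension[extension$year == 2000L, ]
ext_key <- paste(one_year$area_code, one_year$item_cbs_code)
lab_key <- paste(labels$area_code, labels$item_cbs_code)

# For each label, in index order, sum the matching extension rows.
# A label with no match sums an empty vector and so gets 0; extension
# keys that match no label (here area 9, item 99) are never read.
aligned <- vapply(
  lab_key[order(labels$index)],
  function(key) sum(one_year$impact_u[ext_key == key]),
  numeric(1),
  USE.NAMES = FALSE
)
aligned
```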

Examples

extension <- tibble::tibble(
  year = 2000L,
  area_code = 1L,
  item_cbs_code = 10L,
  impact_u = 5
)
labels <- tibble::tibble(
  index = 1:2,
  area_code = c(1L, 1L),
  item_cbs_code = c(10L, 20L)
)
align_extension(extension, labels, 2000L)

Allocate grazing land forward to livestock products.

Description

Push grazing land forward along the feed chain instead of tracing it backward through a Leontief inverse. Grass is not traded, so the multi-regional input-output footprint leaks the grazing land that meat and dairy consumption actually drives (the feed-conversion columns make the system non-productive). This function hands every unit of grazing land forward in two conserving steps:

pool each country's grazing land and split it across the grazing animals that fed there, in proportion to their forage intake;
split each animal's share across its output products in proportion to production mass, landing the land on the meat and milk items that are actually traded.

The result is a direct-land extension keyed by livestock output item, ready to route to consumers with compute_footprint_balance(). Because each step redistributes a total without creating or destroying it, the country-level land total is conserved whenever an animal fed and produced an output there.

Usage

allocate_grazing_to_products(
  grass_land,
  grazer_intake,
  livestock_production,
  products = c("all", "meat_milk")
)

Arguments

grass_land

Tibble with year, area_code and value (grazing land, e.g. hectares). Multiple rows per country (for example one per grass item) are pooled.

grazer_intake

Tibble with year, area_code, live_anim_code and value (the intake by which to split the land across animals), as from get_feed_intake(). Using grazer forage intake (grass plus roughage residues) rather than the "grass" feed type alone keeps extensive-grazing countries, whose forage the feed model classes as residues, from dropping out.

livestock_production

Tibble with year, area_code, live_anim_code, item_cbs_code and value (output production, tonnes).

products

Which output items absorb the land. "all" (default) spreads it across every livestock output by mass; "meat_milk" restricts it to meat and dairy items, so all grazing land lands on meat and dairy consumers per the forward-allocation goal.

Value

A tibble with year, area_code, item_cbs_code (a livestock output item), value (grazing land allocated to it) and method_allocation (the chosen products).
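
The two proportional steps can be worked through by hand for a single country-year. The base R sketch below is an illustration under the assumption that each step is a plain proportional split; the numbers are made up and the package function is not called.

```r
grass_land <- 100                      # pooled grazing land, e.g. hectares
intake <- c("960" = 75, "976" = 25)    # forage intake by animal code
production <- data.frame(
  live_anim_code = c("960", "960", "976"),
  item_cbs_code = c(2848L, 2731L, 2732L),
  value = c(90, 10, 5)                 # output production, tonnes
)

# Step 1: split the land across animals in proportion to intake.
land_by_animal <- grass_land * intake / sum(intake)

# Step 2: split each animal's land across its products by production mass.
mass_in_animal <- ave(production$value, production$live_anim_code, FUN = sum)
share_in_animal <- production$value / mass_in_animal
production$land <- as.numeric(
  land_by_animal[production$live_anim_code] * share_in_animal
)
production
```

Both steps only redistribute a fixed total, so the allocated land sums back to the pooled input.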

Examples

grass_land <- tibble::tibble(
  year = 2010L, area_code = 1L, value = 100
)
grazer_intake <- tibble::tibble(
  year = 2010L,
  area_code = 1L,
  live_anim_code = c(960L, 976L),
  value = c(75, 25)
)
livestock_production <- tibble::tibble(
  year = 2010L,
  area_code = 1L,
  live_anim_code = c(960L, 960L, 976L),
  item_cbs_code = c(2848L, 2731L, 2732L),
  value = c(90, 10, 5)
)
allocate_grazing_to_products(
  grass_land, grazer_intake, livestock_production
)

Allocate field-available manure to cropland and grassland by crop.

Description

Distributes the manure nitrogen, carbon and volatile solids that survive management (the output of apply_management_losses()) across cropland crops and grassland, with a local agronomic cap and an explicit overflow path. The collected/housed manure fills cropland up to each crop's cap (weighted by West-2014 receptivity by default), the surplus spills onto grassland up to the grassland cap, and any residual beyond all reachable sinks follows the disposal_method. The in-situ grazing stream is deposited on grassland where it falls, uncapped. Carbon and volatile solids ride along with each stream at its post-storage bundle ratio, so mass is conserved by construction.

Usage

allocate_manure_to_land(applied, gridded = list(), options = list())

Arguments

applied

A tibble from apply_management_losses() with at least year, territory, sub_territory, stream, applied_n, applied_c and applied_vs.

gridded

A named list describing the land surface for each polity:

crops: a tibble keyed by year, territory, sub_territory, crop with the allocation weight (manure_n_receptivity for "area_x_receptivity", crop_n_demand for "crop_n_demand") and the cap basis (crop_n_cap, t N, for "potential_uptake"/"realised_removal"; crop_area_ha for "fixed_ceiling").
grass (optional): a tibble keyed by year, territory, sub_territory with grass_n_cap (t N) or grass_area_ha. The grassland cap is scaled by f_n_tolerance on the same footing as the uptake-based crop caps (not for "fixed_ceiling"). Absent grassland means no grassland sink (cap zero).

options

A named list of method options: method ("area_x_receptivity" default, or "crop_n_demand"), cap_method ("potential_uptake" default, "realised_removal", or "fixed_ceiling"), f_n_tolerance (default 1.2, applied to the uptake-based caps), fixed_ceiling_kg_ha (default 170, EU Nitrates) and disposal_method ("over_apply_local" default, "unmanaged_disposal", or "retain_unallocated").

Value

A tibble with one row per allocation target: year, territory, sub_territory, land_use ("Cropland"/"Grassland"/"Disposal"/"Unallocated"), crop, source_stream ("collected"/"grazing"), applied_n, applied_c, applied_vs, over_cap and the method_allocation, method_cap and disposal_method provenance columns.

Examples

applied <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~stream,
  ~applied_n, ~applied_c, ~applied_vs,
  2020L, "ESP", NA, "collected", 80, 800, 40,
  2020L, "ESP", NA, "grazing", 20, 380, 12
)
crops <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~crop, ~manure_n_receptivity, ~crop_n_cap,
  2020L, "ESP", NA, "barley", 6, 50,
  2020L, "ESP", NA, "wheat", 4, 40
)
allocate_manure_to_land(applied, list(crops = crops))

Spill surplus manure to neighbouring cells with spare capacity.

Description

Moves the manure nitrogen that a cell could not place locally (its surplus above the local cropland and grassland caps) to nearby same-polity cells that still have room, on the 0.5-degree grid. Each source offers its surplus to its single-ring (king-move) neighbours in proportion to their remaining room; a sink that is over-subscribed by several sources is filled only to its room and the rejected manure returns to the sources as residual. This is the resolution-robust, mass-conserving analogue of Spain's room-weighted first-ring redistribution (Calc_OA_redistribution); cross-polity transport is not allowed. Carbon and volatile solids ride along with each nitrogen flow at the source cell's bundle ratio.

Usage

allocate_manure_transport(source_cells, sink_cells, options = list())

Arguments

source_cells

A tibble of cells exporting manure, keyed by year, territory and sub_territory (a "lon_lat" cell id), with surplus_n, surplus_c and surplus_vs (t).

sink_cells

A tibble of cells with spare capacity, keyed by year, territory and sub_territory, with room_n (remaining cropland plus grassland N capacity, t).

options

A named list. n_rings (default 1) is the neighbourhood radius in grid steps; only the single-ring kernel is shipped.

Value

A tibble with one row per landing site: year, territory, sub_territory (the sink cell for transported manure, the source cell for residual), applied_n, applied_c, applied_vs, kind ("transported" or "residual") and method_transport. The "residual" rows are the un-transportable surplus handed back for local disposal.
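
For a single source cell, the room-weighted offer and the residual can be sketched in a few lines of base R. This is a deliberately simplified illustration with invented numbers: it ignores neighbourhood detection and competition between several sources, and it assumes the offer is simply capped at each sink's room.

```r
surplus_n <- 15                       # source surplus nitrogen, t
surplus_c <- 90                       # carbon travelling with it, t
room_n <- c("1_40" = 4, "2_40" = 6)   # remaining room in two neighbour sinks

# Offer the surplus to the sinks in proportion to their room,
# but never fill a sink beyond its room.
offer_n <- surplus_n * room_n / sum(room_n)
accepted_n <- pmin(offer_n, room_n)

# Whatever the sinks could not take stays with the source as residual.
residual_n <- surplus_n - sum(accepted_n)

# Carbon rides along at the source cell's C:N bundle ratio.
c_per_n <- surplus_c / surplus_n
accepted_c <- accepted_n * c_per_n
residual_c <- residual_n * c_per_n

accepted_n
residual_n
```

Transported plus residual nitrogen equals the original surplus, and the same holds for carbon, which is the mass-conservation property the description refers to.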

Examples

source_cells <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~surplus_n, ~surplus_c, ~surplus_vs,
  2020L, "ESP", "1.5_40", 10, 90, 6
)
sink_cells <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~room_n,
  2020L, "ESP", "1_40", 4,
  2020L, "ESP", "2_40", 6
)
allocate_manure_transport(source_cells, sink_cells)

Animal codes and classifications

Description

Maps live animal CBS items to their livestock classifications, process codes, and associated product items used in livestock modeling.

Usage

animals_codes

Format

A tibble where each row corresponds to one live animal CBS item. It contains the following columns:

Item_Code: Numeric FAOSTAT item code for the live animal.
item_cbs: Name of the CBS item (e.g., "Cattle", "Asses").
proc_code: Short process code used internally (e.g., "p092").
proc: Descriptive process name (e.g., "Asses").
item_cbs_code: Numeric CBS item code (often equal to Item_Code).
Farm_class: Broad farm classification grouping the animal. One of "Cattle", "Dairy_cows", "Monogastric", "Sheep_goats", "Bees", "Game".
Item_product: Name of the primary product derived from this animal, if applicable (e.g., milk for dairy cows).
Item_Code_product: Numeric FAOSTAT code for the associated product item.
Liv_prod_cat: Livestock product category the animal belongs to.
Graniv_grazers: Broad feeding behaviour classification. One of "Grazers", "Granivores", "Bees", "Game".
Livestock_name: Internal livestock identifier used across datasets (e.g., "Cattle", "Dairy_cows", "Asses").
Animal_class: Fine-grained animal class, including production type distinctions (e.g., "Broilers", "Hens", "Hogs", "Dairy_cows").
Item_FAOmanure: Name of the corresponding FAOSTAT manure management item.
Item_Code_FAOmanure: Numeric code of the FAOSTAT manure management item.
Cat_Labour: Labour category used in labour-related analyses. One of "Cattle", "Equines", "Dairy_cows", "Birds", "Small_ruminants", "Pigs", "Bees".
Cat_FAO1: Top-level FAO category. Currently always "Animal".
item_bouwman: Item name used in Bouwman et al. livestock datasets.

Source

Derived from FAOSTAT data and internal livestock classification work.

Examples

head(animals_codes)

Apply IPCC manure-management losses to the collected manure streams.

Description

Nets the nitrogen surviving manure management onto the field, applying the IPCC 2019 management-loss fractions to the collected/housed streams from split_manure_management(): applied_n = n_stream * (1 - FracLossMS) where FracLossMS = FracGasMS + FracLeachMS + EF3 + FracN2MS. The grazing (pasture/range/paddock) stream is deposited in situ and keeps its full nitrogen (its in-situ soil losses belong to the soil stage). Indirect N2O is reported as a labelled sub-flux of the already-removed volatilized and leached nitrogen (the same N is not removed twice). Carbon applied to the field is applied_n times the post-storage manure C:N (the solid/liquid/excreta value for the stream's management system), so the applied C:N reflects storage, not fresh excreta; the carbon and volatile-solids storage losses follow from that.

Usage

apply_management_losses(split, options = list())

Arguments

split

A tibble from split_manure_management().

options

A named list. method selects the loss method ("ipcc_2019_tier2").

Value

The input rows with manure_type, applied_n, applied_c, applied_vs, n_volatilized, n_leached, n2o_direct_n, n2_n, n2o_indirect_n, c_lost, vs_destroyed and method_losses.
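
The loss formula from the description can be checked with a short arithmetic sketch. The fractions and the C:N ratio below are placeholders chosen for readability; they are NOT IPCC default values, and the variable names are illustrative rather than the package's.

```r
# Hypothetical loss fractions for one management system (placeholders).
frac_gas_ms <- 0.20
frac_leach_ms <- 0.05
ef3 <- 0.01
frac_n2_ms <- 0.04
frac_loss_ms <- frac_gas_ms + frac_leach_ms + ef3 + frac_n2_ms

# applied_n = n_stream * (1 - FracLossMS)
n_stream <- 100
applied_n <- n_stream * (1 - frac_loss_ms)

# Each loss term is a share of the stream, so losses plus applied
# nitrogen add back up to the stream.
losses_n <- n_stream * c(frac_gas_ms, frac_leach_ms, ef3, frac_n2_ms)

# Applied carbon follows from a post-storage C:N ratio (placeholder value).
post_storage_cn <- 12
applied_c <- applied_n * post_storage_cn

applied_n
applied_c
```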

Examples

excretion <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~livestock_category,
  ~n_excretion, ~c_excretion, ~vs_excretion,
  2020L, "ES", NA, "Cattle_milk", 100, 1900, 60
)
apply_management_losses(split_manure_management(excretion))

Assert that footprint conservation invariants hold.

Description

Build-time gate over a footprint result, intended for use in a pipeline or regression test. Aborts when the footprint over-traces any origin sector (embodied pressure exceeds the source, which should never happen) or when the global share of untraced pressure exceeds max_rel_loss. A regression that silently loses or inflates pressure then fails loudly instead of shipping.

Usage

assert_footprint_invariants(
  footprint,
  extensions,
  labels,
  x_vec,
  max_rel_loss = 0.05
)

Arguments

footprint

Footprint tibble from compute_footprint(), with origin_area, origin_item and value columns.

extensions

Numeric vector of environmental extensions per sector, as passed to compute_footprint().

labels

Tibble with area_code and item_cbs_code mapping each sector to its meaning, as passed to compute_footprint().

x_vec

Numeric vector of total output per sector.

max_rel_loss

Maximum tolerated global relative under-tracing (pressure that never reaches final demand). The engine's negative-zeroing and column capping cause a small amount, so this is non-zero by default.

Value

Invisibly, the one-row summary from summarise_conservation(). Called for its side effect of aborting on violation.

Examples

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
y_mat <- matrix(c(85, 195), ncol = 1)
extensions <- c(50, 30)
labels <- tibble::tibble(
  area_code = c(1L, 1L),
  item_cbs_code = c(1L, 2L)
)
fp <- compute_footprint(
  x_vec = x_vec, y_mat = y_mat, extensions = extensions,
  labels = labels, z_mat = z_mat
)
assert_footprint_invariants(fp, extensions, labels, x_vec)

Attach a provenance record to a result.

Description

Store a provenance record (from record_provenance()) on an object as an attribute, so the result travels together with its lineage. Retrieve it later with get_provenance().

Usage

attach_provenance(x, provenance)

Arguments

x

Any R object, typically a result tibble.

provenance

Provenance tibble from record_provenance().

Value

x, unchanged except for an added whep_provenance attribute.
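
The attribute mechanism itself is plain base R. The sketch below uses a stand-in data frame for the provenance record (its alias column is hypothetical) and shows only how an attribute named whep_provenance would be set and read; it is not the package's code.

```r
# Stand-in for a provenance record; the real one comes from
# record_provenance().
prov <- data.frame(alias = "bilateral_trade")
result <- data.frame(value = 1)

# Store the record on the object, then read it back.
attr(result, "whep_provenance") <- prov
stored <- attr(result, "whep_provenance")
stored
```

Custom attributes are not guaranteed to survive every later transformation of the object, so it may be safest to attach the record as the last step of a pipeline.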

Examples

prov <- record_provenance(
  aliases = "bilateral_trade",
  recorded_at = as.POSIXct("2026-01-01", tz = "UTC")
)
result <- attach_provenance(tibble::tibble(value = 1), prov)
get_provenance(result)

Attach a scope record to a result.

Description

Store a scope record (from footprint_scope()) on an object as an attribute, so the result carries its goal-and-scope with it.

Usage

attach_scope(x, scope)

Arguments

x

Any R object, typically a footprint tibble.

scope

Scope tibble from footprint_scope().

Value

x, unchanged except for an added whep_scope attribute.

Examples

scope <- footprint_scope("cropland", "ha", "FABIO-MRIO")
result <- attach_scope(tibble::tibble(value = 1), scope)
get_scope(result)

Attribute reported fallow land to crops.

Description

Distribute each country's FAOSTAT-reported fallow area among its crops using a precomputed allocation weight, adding the result to each crop's cropped physical area. The weight is typically gridded_fallow_weights(), which puts fallow on rainfed dryland cereals and rainfed monsoon rice and keeps it off irrigated/continuous systems.

The fallow magnitude comes from fallow_total (FAOSTAT "Land with temporary fallow", item 6640). FAOSTAT reports this item separately from temporary meadows/pastures, so it isolates real fallow from fodder.

Usage

attribute_fallow_to_crops(cropgrids, fallow_total, alloc_weight)

Arguments

cropgrids

Tibble of national crop areas with columns area_code, item_cbs_code, physical_ha (cropped physical area), harvested_ha.

fallow_total

Tibble of reported fallow area with columns area_code and fallow_ha.

alloc_weight

Tibble of area_code, item_cbs_code, weight giving the within-country allocation weight, e.g. from gridded_fallow_weights().

Value

A tibble with area_code, item_cbs_code, physical_ha (cropped physical area plus attributed fallow), and harvested_ha.

Examples

cropgrids <- tibble::tribble(
  ~area_code, ~item_cbs_code, ~physical_ha, ~harvested_ha,
  1L, 2511L, 500, 500,
  1L, 2807L, 400, 400
)
fallow_total <- tibble::tribble(~area_code, ~fallow_ha, 1L, 200)
# all weight on wheat -> the 200 ha reported fallow goes to wheat
alloc_weight <- tibble::tribble(
  ~area_code, ~item_cbs_code, ~weight,
  1L, 2511L, 1,
  1L, 2807L, 0
)
attribute_fallow_to_crops(cropgrids, fallow_total, alloc_weight)

Balance input-output flows so the footprint conserves.

Description

RAS-balance an inter-industry flow matrix toward the product-balance margin. The row target is intermediate use, $u = X - \text{rowSums}(Y)$; the column target keeps the original column composition, rescaled so its total matches $\sum u$.
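
The two targets can be sketched in base R as below. This is one reading of the description, not the package's code: "original column composition" is taken here to mean the original column sums, scaled by a common factor.

```r
z_mat <- matrix(c(1, 4, 2, 3), nrow = 2)
x_vec <- c(10, 10)
y_mat <- c(3, 3)

# Total final demand per product: row sums for a matrix, the vector otherwise.
y_total <- if (is.matrix(y_mat)) rowSums(y_mat) else y_mat

# Row target: intermediate use, u = X - rowSums(Y).
row_target <- x_vec - y_total

# Column target (assumed reading): original column sums, rescaled so that
# their total equals sum(u).
col_target <- colSums(z_mat) * sum(row_target) / sum(z_mat)

row_target
col_target
```

Both targets share the same total, which is the precondition balance_ras() places on its margins.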

When the balanced system is productive (the spectral radius of $A$ is below 1), $(I - A) X = Y$ holds and the footprint conserves – the case for the productive systems in the examples and tests.
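
A quick way to test the productivity condition is sketched below in base R. The helper spectral_radius() is hypothetical and the matrices are toy values; the coefficient matrix is assumed to be built as column j of the flow matrix divided by output j.

```r
# Spectral radius: largest modulus among the eigenvalues of A.
spectral_radius <- function(a_mat) {
  max(Mod(eigen(a_mat, only.values = TRUE)$values))
}

# Toy balanced system: rowSums(Z) + Y = X by construction.
z_mat <- matrix(c(2, 1, 1, 3), nrow = 2)
x_vec <- c(10, 10)
y_vec <- x_vec - rowSums(z_mat)

# Technical coefficients (assumed form): column j of Z divided by output x_j.
a_mat <- sweep(z_mat, 2, x_vec, "/")

# Productivity check and the identity (I - A) X = Y.
spectral_radius(a_mat) < 1
all.equal(as.vector((diag(2) - a_mat) %*% x_vec), y_vec)

# A toy coefficient matrix with one column summing well above 1.
a_heavy <- matrix(c(0.2, 5, 0.3, 0.1), nrow = 2)
spectral_radius(a_heavy) < 1
```

The first system passes the check and the second does not, which is the situation check_footprint_conservation() is meant to catch on real data.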

Caveat for physical agriculture: livestock feed conversion makes some columns of $A$ sum well above 1 (many tonnes of feed per tonne of product), so the balanced system is not productive. Balancing to faithful margins does not remove that, and the traced footprint then over-traces rather than conserving (on the real 2010 model, by about 15%). RAS therefore does not, on its own, fix the grassland under-tracing; always verify with check_footprint_conservation().

Pass the result as z_mat to compute_footprint() with a large max_column_sum (e.g. 1e12) and conserve_extensions = FALSE.

Usage

balance_io_flows(z_mat, x_vec, y_mat, max_iter = 1000L, tol = 1e-06)

Arguments

z_mat

Inter-industry flow matrix (dense or Matrix).

x_vec

Total output per sector.

y_mat

Final demand vector, or matrix whose row sums give total final demand per product.

max_iter, tol

Passed to balance_ras().

Value

The balanced flow matrix.

Examples

z_mat <- matrix(c(1, 4, 2, 3), nrow = 2)
x_vec <- c(10, 10)
y_mat <- c(3, 3)
balance_io_flows(z_mat, x_vec, y_mat)

Balance a matrix to target margins by RAS.

Description

Reconcile a non-negative matrix to prescribed row and column sums using the RAS (biproportional / iterative proportional fitting) algorithm: alternately rescale rows then columns until both margins are matched, preserving the matrix's structure (zero pattern and relative magnitudes). The row and column targets must have equal totals.

Works on dense and sparse (Matrix) inputs, staying in the input representation so it is efficient for both small dense tables and large sparse input-output systems. (The same core also balances the bilateral trade matrices.) For matrices with negative entries the signed GRAS variant is needed and is not implemented here.
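
For orientation, the alternating row and column scaling can be sketched for dense matrices in base R. The helper ras_sketch() is hypothetical and is not the package's implementation: it ignores sparse inputs, and its stopping rule is only assumed to resemble the one described for tol.

```r
# Minimal dense RAS sketch (hypothetical helper, not the package's code).
ras_sketch <- function(x, target_rows, target_cols,
                       max_iter = 1000L, tol = 1e-9) {
  stopifnot(
    all(x >= 0),
    isTRUE(all.equal(sum(target_rows), sum(target_cols)))
  )
  total <- sum(target_rows)
  for (iter in seq_len(max_iter)) {
    # Row step: scale each row to its target (all-zero rows are left as is).
    rs <- rowSums(x)
    x <- x * ifelse(rs > 0, target_rows / rs, 1)
    # Column step: scale each column to its target.
    cs <- colSums(x)
    x <- sweep(x, 2, ifelse(cs > 0, target_cols / cs, 1), "*")
    # Columns now match; stop once rows do too, relative to the margin total.
    if (max(abs(rowSums(x) - target_rows)) / total < tol) break
  }
  x
}

m <- matrix(c(1, 2, 3, 4), nrow = 2)
balanced <- ras_sketch(m, target_rows = c(10, 20), target_cols = c(12, 18))
rowSums(balanced)
colSums(balanced)
```

Because every step only multiplies entries by positive factors, zeros in the input stay zero in the output.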

Usage

balance_ras(x, target_rows, target_cols, max_iter = 1000L, tol = 1e-09)

Arguments

x

Non-negative matrix (dense or Matrix).

target_rows

Desired row sums, length nrow(x).

target_cols

Desired column sums, length ncol(x).

max_iter

Maximum RAS iterations.

tol

Convergence tolerance on the largest margin deviation, relative to the margin total (so it is scale-free).

Value

The balanced matrix, in the same representation as x (dense in, dense out; sparse in, sparse out).

Examples

m <- matrix(c(1, 2, 3, 4), nrow = 2)
balance_ras(m, target_rows = c(10, 20), target_cols = c(12, 18))

Biomass coefficients for crops and livestock products

Description

Provides dry-matter, nutrient, and energy conversion coefficients for agricultural products and residues. Used to convert fresh-matter production quantities into biomass flows, nutrient budgets, and energy content.
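
A typical conversion chain is sketched below. The column names Product_kgDM_kgFM and Product_kgN_kgDM are those listed under Format; the item code and coefficient values are made up for illustration and are not taken from the dataset.

```r
# Illustrative coefficients for one item; NOT values from biomass_coefs.
coefs <- data.frame(
  Code = "X1",                 # hypothetical item code
  Product_kgDM_kgFM = 0.85,    # kg dry matter per kg fresh matter
  Product_kgN_kgDM = 0.02      # kg N per kg dry matter
)
production_fm_t <- 1000        # fresh-matter production, tonnes

# Fresh matter -> dry matter -> nitrogen (tonnes throughout).
production_dm_t <- production_fm_t * coefs$Product_kgDM_kgFM
production_n_t <- production_dm_t * coefs$Product_kgN_kgDM

c(dry_matter_t = production_dm_t, nitrogen_t = production_n_t)
```

In practice the coefficients would be joined from biomass_coefs by item code rather than typed in.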

Usage

biomass_coefs

Format

A tibble where each row corresponds to one product or item. It contains 68 columns:

Code: Item code (character), corresponding to FAOSTAT production codes.
Name_biomass: Item name as used in biomass accounting.
Equiv: Reference equivalence item used when coefficients are borrowed from another similar commodity (e.g., "Wheat" for oats).
Category: Broad commodity category (e.g., "Cereals, other", "Barley", "Vegetables").
BG_Biomass_kgDM_ha: Below-ground biomass in kg dry matter per hectare.
Root_Shoot_ratio: Ratio of root to aerial biomass (dimensionless).
Product_kgDM_kgFM: Product dry-matter content in kg DM per kg fresh matter.
Residue_kgDM_kgFM: Residue dry-matter content in kg DM per kg fresh matter of product.
Conventional_kgDM_ha: Conventional yield in kg dry matter per hectare.
Organic_kgDM_ha: Organic yield in kg dry matter per hectare.
GE_product_edible_portion_MJ_kgFM: Gross energy of the edible portion in MJ per kg fresh matter.
GE_product_residue_MJ_kgFM: Gross energy of the residue in MJ per kg fresh matter (may be character due to source formatting).
GE_product_MJ_kgFM: Gross energy of the whole product in MJ per kg fresh matter.
GE_residue_MJ_kg: Gross energy of the residue in MJ per kg.
kg_product_kg_aerial_biomass: Fraction of aerial biomass that is product (harvest index, kg/kg).
kg_residue_kg_aerial_biomass_FM: Fraction of aerial biomass that is residue, on fresh matter basis.
kg_residue_kg_product_FM: Ratio of residue to product on fresh matter basis.
Carcass_to_LW: Carcass-to-live-weight ratio (livestock only; logical placeholder for crop items).
Edible_portion: Edible fraction of the product (kg edible / kg fresh matter).
N_kgN_kgFM: Nitrogen content in kg N per kg fresh matter.
Lipids_g_kgFM: Lipid content in g per kg fresh matter.
Carbohydrates_g_kgFM: Carbohydrate content in g per kg fresh matter.
Calcium_mg_kgFM: Calcium content in mg per kg fresh matter.
VitaminA_microg_kgFM: Vitamin A content in micrograms per kg fresh matter.
Edible_kgDM_kgFM: Edible dry matter in kg per kg fresh matter.
Edible_kgC_kgFM: Edible carbon in kg C per kg fresh matter.
Edible_N_kgFM: Edible nitrogen in kg N per kg fresh matter.
Edible_kgP_kgFM: Edible phosphorus in kg P per kg fresh matter.
Edible_K_kgFM: Edible potassium in kg K per kg fresh matter.
NonEdible_kgDM_kgFM: Non-edible dry matter in kg per kg fresh matter.
NonEdible_kgC_kgFM: Non-edible carbon in kg C per kg fresh matter.
NonEdible_kgN_kgFM: Non-edible nitrogen in kg N per kg fresh matter.
NonEdible_kgP_kgFM: Non-edible phosphorus in kg P per kg fresh matter.
NonEdible_kgK_kgFM: Non-edible potassium in kg K per kg fresh matter.
Product_kgN_kgDM: Nitrogen content of product in kg N per kg dry matter.
Product_kgP_kgDM: Phosphorus content of product in kg P per kg dry matter.
Product_kgK_kgDM: Potassium content of product in kg K per kg dry matter.
Product_kgC_kgDM: Carbon content of product in kg C per kg dry matter.
Residue_kgN_kgDM: Nitrogen content of residue in kg N per kg dry matter.
Residue_kgP_kgDM: Phosphorus content of residue in kg P per kg dry matter.
Residue_kgK_kgDM: Potassium content of residue in kg K per kg dry matter.
Residue_kgC_kgDM: Carbon content of residue in kg C per kg dry matter.
Residue_humified_kgC_kgC: Humification coefficient of residue carbon (fraction of residue C stabilised as soil organic matter).
MgDM_m3: Megagrams dry matter per cubic metre (bulk density proxy).
Root_kgC_kgDM: Root plus rhizodeposit carbon in kg C per kg root dry matter.
Root_humified_kgC_kgC: Humification coefficient for root carbon.
Root_mass_kgC_kgDM: Root carbon mass in kg C per kg crop dry matter.
Rhizodeposits_mass_kgC_kgDM: Rhizodeposit carbon in kg C per kg crop dry matter.
Residue_C_N: Carbon-to-nitrogen ratio of the residue.
Root_kgN_kgDM: Nitrogen content of roots in kg N per kg root dry matter.
GE_Roots_MJ_kgDM: Gross energy of roots in MJ per kg dry matter.
Rhizodeposits_N_kgN_kgRootN: Rhizodeposit nitrogen as a fraction of root nitrogen.
Fiber_g_kgFM: Dietary fibre content in g per kg fresh matter.
SFA_g_kgFM: Saturated fatty acid content in g per kg fresh matter.
MUFA_g_kgFM: Monounsaturated fatty acid content in g per kg fresh matter.
PUFA_g_kgFM: Polyunsaturated fatty acid content in g per kg fresh matter.
PUFA_n3_g_kgFM: Omega-3 PUFA content in g per kg fresh matter.
Iron_mg_kgFM: Iron content in mg per kg fresh matter.
Zinc_mg_kgFM: Zinc content in mg per kg fresh matter.
Magnesium_mg_kgFM: Magnesium content in mg per kg fresh matter.
Cadmium_microg_kgFM: Cadmium content in micrograms per kg fresh matter.
VitaminB12_microg_kgFM: Vitamin B12 content in micrograms per kg fresh matter.
VitaminD_microg_kgFM: Vitamin D content in micrograms per kg fresh matter.
Folate_microg_kgFM: Folate content in micrograms per kg fresh matter.
VitaminC_mg_kgFM: Vitamin C content in mg per kg fresh matter.
VitaminE_mg_kgFM: Vitamin E content in mg per kg fresh matter.
Flavonoids_mg_kgFM: Flavonoid content in mg per kg fresh matter.
Carotenoids_mg_kgFM: Carotenoid content in mg per kg fresh matter.

Source

Compiled from multiple sources including FAO food composition data, crop physiology literature, and IPCC Tier 1 coefficients.

Examples

head(biomass_coefs)

Build CBS item prices

Description

Compute prices for all commodity balance sheet items, including processed products and crop residues. Prices are derived from trade data, with special handling for items without direct trade prices (palm kernels, soy hulls, brans, etc.). Crop residue prices are estimated as a fraction of the product price.

Usage

build_cbs_prices(
  cbs,
  trade_prices = NULL,
  residue_price_factor = 0.1,
  example = FALSE
)

Arguments

cbs

A tibble of commodity balance sheets, as returned by build_commodity_balances() or get_wide_cbs().

trade_prices

A tibble as returned by build_trade_prices(). If NULL, it is computed internally.

residue_price_factor

Numeric. Relative price of crop residues compared to the product. Default 0.1.

example

Logical. If TRUE, return a small example tibble. Default FALSE.

Value

A tibble with columns:

year: Integer year.
element: "import" or "export".
item_cbs_code: Numeric CBS item code.
price: Price in KDollars per tonne.

Examples

build_cbs_prices(example = TRUE)

Build commodity balance sheets

Description

Construct commodity balance sheets (CBS) from raw FAOSTAT data. This is a convenience wrapper that chains the three pipeline steps:

.read_cbs() — read & reformat FAOSTAT CBS data.
.fix_cbs() — processing calibration, trade imputation, destiny filling, and final balancing.
.qc_cbs() — flag data-quality anomalies.

Usage

build_commodity_balances(
  primary_all,
  start_year = 1850,
  end_year = 2023,
  smooth_carry_forward = FALSE,
  example = FALSE,
  historical_data = NULL,
  .fixed_data = NULL
)

Arguments

primary_all

A tibble of primary production, as returned by build_primary_production().

start_year

Integer. First year to include. Default 1850.

end_year

Integer. Last year to include. Default 2023.

smooth_carry_forward

Logical. If TRUE, carry-forward tails are replaced with a linear trend. Default FALSE.

example

Logical. If TRUE, return a small hardcoded example tibble instead of reading remote data. Default FALSE.

historical_data

Optional harmonized historical CBS or production rows to add before the CBS historical extension. May be a data frame or a path to a parquet/csv file. CBS-shaped rows should provide year, value, one of area_code or polity_area_code, one of item_cbs_code or item_prod_code, and preferably element. Production-shaped rows without element are accepted as production when their unit is tonnes. Default NULL.

.fixed_data

Optional tibble with the same structure as the output of the internal .read_cbs() |> .fix_cbs() steps. When supplied, primary_all is ignored and the pipeline skips directly to .qc_cbs(). Default NULL.

Value

A tibble in long format with columns: year, legacy numeric area_code, numeric polity_area_code, reporting_polity_code, reporting_polity_name, reporting_polity_has_geometry, item_cbs_code, element (e.g. "production", "import", "food"), value, source, and fao_flag.

Examples

build_commodity_balances(example = TRUE)

Build a constant-territory time series for a reference year's boundaries

Description

Estimates a time series of a quantity over a fixed set of territorial boundaries — the polities active in ref_year — from data reported under the changing historical boundaries of each data year.

Country borders change over time, so there is no raw constant-territory series: a 1900 figure for "Austria-Hungary" is not a figure for present-day Austria. This function estimates one by spatial reallocation (dasymetric areal interpolation):

For each data year, the value reported by each source polity is spread over that polity's own extent for that year across a regular grid, weighted by a covariate density (e.g. gridded cropland or population; uniform = plain areal weighting).
The grid is then re-aggregated to the ref_year target boundaries: a target's estimate is the sum of grid mass falling inside it.
Target territory not covered by any source with data in that year is imputed — its grid cells still carry covariate mass, so they are filled at a donor intensity (value per unit covariate) rather than left at zero. The fraction of a target's covariate mass that had to be imputed is reported as imputed_share, an honest confidence signal.

The estimate is only as good as the covariate: supply the same gridded surface used elsewhere in WHEP spatialization (cropland for crop output, population for demographic series, livestock density for animals). With covariate = NULL the method reduces to area-weighted areal interpolation.
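
The three steps can be illustrated without any geometry on a one-dimensional toy grid. This is a sketch of the idea only, not the package's code: the cells, covariate values, and target assignment are made up, and a single donor intensity is assumed for the whole region.

```r
# Six grid cells in a row, each with a covariate density (e.g. cropland).
covariate <- c(1, 3, 2, 2, 4, 4)
# The only source polity with data covers cells 1-4 and reports 100.
covered <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
source_value <- 100
# Reference-year targets: "A" = cells 1-3, "B" = cells 4-6.
target <- c("A", "A", "A", "B", "B", "B")

# Spread the reported value over the source's cells by covariate share.
donor_intensity <- source_value / sum(covariate[covered])
cell_covered <- ifelse(covered, donor_intensity * covariate, 0)

# Fill uncovered cells at the donor intensity (value per unit covariate).
cell_imputed <- ifelse(covered, 0, donor_intensity * covariate)

# Re-aggregate grid mass to the target boundaries.
result <- data.frame(
  target_polity_code = c("A", "B"),
  covered = as.vector(tapply(cell_covered, target, sum)),
  imputed = as.vector(tapply(cell_imputed, target, sum)),
  imputed_share = as.vector(
    tapply(covariate * (!covered), target, sum) /
      tapply(covariate, target, sum)
  )
)
result$value <- result$covered + result$imputed
result
```

Target "A" lies entirely inside the source, so its imputed share is zero; most of target "B" has no source coverage, so most of its estimate is imputed.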

Usage

build_constant_territory_series(
  data,
  ref_year,
  polities = NULL,
  covariate = NULL,
  resolution = 25000,
  donor = c("regional", "none"),
  crs_equal_area = 6933,
  max_cells = 2e+06,
  verbose = TRUE
)

Arguments

data

A data frame of reported values with columns:

year: integer data year.
polity_code: the source polity that reported the value (must be active in year and carry a polygon).
value: numeric value (summed if a polity appears more than once).

ref_year

Integer. Target boundaries are the polities active in this year (start_year <= ref_year <= end_year).

polities

An sf of polity polygons with polity_code, start_year, end_year and geometry. Defaults to get_polity_geometries().

covariate

NULL (uniform density, i.e. area weighting) or a function with the signature function(centroids_sf, year) -> numeric that returns a non-negative density per grid-cell centroid (centroids are supplied in crs_equal_area).

resolution

Grid cell size, in metres of crs_equal_area. Default 25000 (25 km). Smaller is more accurate but slower.

donor

Gap-imputation rule: "regional" (default) fills uncovered target cells at the region-wide value-per-covariate intensity of the sources with data that year; "none" leaves them at zero (covered-only).

crs_equal_area

EPSG code of an equal-area CRS used for gridding and areas. Default 6933 (NSIDC EASE-Grid 2.0 Global).

max_cells

Safety cap on grid cells per year (default 2e6). Aborts if the source/target extent would exceed it (usually a stray continent-scale target); restrict polities, coarsen resolution, or raise this.

verbose

Logical; emit progress/warnings.

Value

A tibble, one row per (ref_year-target, data year):

target_polity_code, year
value: constant-territory estimate (covered + imputed)
covered: mass from cells overlapping a source with data
imputed: mass added for uncovered cells
imputed_share: covariate fraction imputed (0 = fully observed)
n_sources: number of source polities contributing that year

Examples

# Self-contained toy: two adjacent square polities. Only "P1" reports a
# value in 1900, so when the series is rebuilt onto the boundaries active in
# `ref_year` 2000 (both polities), "P2" is imputed from "P1"'s intensity.
make_square <- function(xmin, ymin, side) {
  sf::st_polygon(list(rbind(
    c(xmin, ymin),
    c(xmin + side, ymin),
    c(xmin + side, ymin + side),
    c(xmin, ymin + side),
    c(xmin, ymin)
  )))
}
polities <- sf::st_sf(
  polity_code = c("P1", "P2"),
  start_year = c(1800L, 1800L),
  end_year = c(2025L, 2025L),
  geometry = sf::st_sfc(
    make_square(0, 0, 2),
    make_square(2, 0, 2),
    crs = 4326
  )
)
reported <- tibble::tibble(
  year = 1900L,
  polity_code = "P1",
  value = 100
)
build_constant_territory_series(
  reported,
  ref_year = 2000,
  polities = polities,
  resolution = 50000,
  verbose = FALSE
)

Build per-crop physical cropland extension.

Description

Convert gridded crop harvested area into per-crop physical land area and aggregate it to commodity-balance items, producing a land extension keyed by (year, area_code, item_cbs_code) for the FABIO footprint model.

The gridded land-use pipeline (build_gridded_landuse()) distributes FAOSTAT harvested area across grid cells; per-crop totals therefore conserve to harvested area, which over-counts multi-cropped land and under-counts fallow. This function turns that harvested area into physical occupied land:

"cropland_apportion" (default): within each cell, the cell's physical cropland (cropland_ha, from LUH2) is split across crops in proportion to their share of the cell's harvested area. Per-crop physical area then conserves to physical cropland rather than to harvested area, capturing both double-cropping (scaled down) and fallow (resting land charged to the crops the rotation supports) at the resolution of the grid.
"intensity_divide": each crop's harvested area is divided by the cell multi-cropping intensity (mc_rainfed, mc_irrigated). Requires multicropping.

Unlike a single country-level cropping-intensity factor applied uniformly to every crop, both methods distribute physical cropland by the actual spatial pattern of each crop.
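
The per-cell arithmetic of the two methods can be sketched in base R as follows. This is an illustration, not the package's code: cells are identified by a made-up cell label instead of lon/lat, crops by name instead of item code, and a single multi-cropping intensity per cell is assumed instead of separate rainfed and irrigated values.

```r
# Toy grid: two cells, harvested area per crop (ha); numbers are illustrative.
crops <- data.frame(
  cell = c("c1", "c1", "c2"),
  item = c("wheat", "rice", "wheat"),
  harvested_ha = c(600, 200, 400)
)
cropland <- data.frame(cell = c("c1", "c2"), cropland_ha = c(1000, 500))

x <- merge(crops, cropland, by = "cell")

# "cropland_apportion": split each cell's cropland by harvested-area share.
cell_harvested <- ave(x$harvested_ha, x$cell, FUN = sum)
x$apportioned_ha <- x$cropland_ha * x$harvested_ha / cell_harvested

# "intensity_divide": divide harvested area by the cell's multi-cropping
# intensity (one assumed value per cell here, for brevity).
mc <- c(c1 = 1, c2 = 2)
x$divided_ha <- x$harvested_ha / unname(mc[as.character(x$cell)])

# Aggregate the apportioned area per crop.
per_item <- tapply(x$apportioned_ha, x$item, sum)
per_item
```

With apportionment the per-crop areas add up to the total physical cropland of the cells, whatever the harvested totals were.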

Coverage note: with "cropland_apportion" the per-country crop total is bounded by the LUH2 cropland layer, which can under-represent perennial or plantation crops (e.g. oil palm, rubber) classified outside cropland; such crops may receive less land than a harvested-area baseline implies.

Usage

build_crop_land_extension(
  gridded_crops,
  gridded_cropland,
  items_prod_full = whep::items_prod_full,
  method = c("cropland_apportion", "intensity_divide"),
  multicropping = NULL
)

Arguments

gridded_crops

Tibble of gridded crop harvested area, the crop-level output of build_gridded_landuse() (built without CFT aggregation). Must have columns lon, lat, year, area_code, item_prod_code, rainfed_ha, irrigated_ha.

gridded_cropland

Tibble of physical cropland per cell. Must have columns lon, lat, year, cropland_ha.

items_prod_full

Crosswalk from production items to commodity-balance items. Defaults to items_prod_full. Must have columns item_prod_code and item_cbs_code.

method

Physical-area conversion method. One of "cropland_apportion" (default) or "intensity_divide".

multicropping

Tibble of per-cell multi-cropping intensity, required for method = "intensity_divide". Must have columns lon, lat, mc_rainfed, mc_irrigated (and optionally year).

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (physical land area in hectares), and method_land (the chosen method).

Examples

gridded_crops <- tibble::tribble(
  ~lon, ~lat, ~year, ~area_code, ~item_prod_code, ~rainfed_ha, ~irrigated_ha,
  0.25, 50.25, 2000L, 1L, 15L, 600, 0,
  0.25, 50.25, 2000L, 1L, 27L, 200, 0,
  0.75, 50.25, 2000L, 1L, 15L, 400, 0
)
gridded_cropland <- tibble::tribble(
  ~lon, ~lat, ~year, ~cropland_ha,
  0.25, 50.25, 2000L, 1000,
  0.75, 50.25, 2000L, 500
)
items <- tibble::tribble(
  ~item_prod_code, ~item_cbs_code,
  15L, 2511L,
  27L, 2805L
)
build_crop_land_extension(gridded_crops, gridded_cropland, items_prod_full = items)

Build the crop/soil N2O extension.

Description

Estimate IPCC 2019 Tier 1 nitrous-oxide emissions from nitrogen applied to managed soils, as a footprint extension keyed by (year, area_code, item_cbs_code) in kilograms of carbon-dioxide equivalent (CO2e). This is the soil-N2O analogue of build_livestock_ghg_extension() and feeds build_footprint() / compute_footprint() the same way.

Three nitrogen inputs to soil are included:

Synthetic fertiliser (F_SN): FAOSTAT reports it only as a country total (tonnes N per area_code per year), so it is allocated to crops in proportion to each crop's harvested area within the country-year (from get_primary_production()).
Applied manure (F_ON): FAOSTAT "Manure applied to soils (N content)" country total, allocated to crops by harvested area as for F_SN.
Crop residues (F_CR): the dry matter of above-ground residues returned to soil (from get_primary_residues(), net of the removed fraction) times the crop's residue nitrogen content (IPCC 2019 Table 11.1a).

N2O is then estimated with IPCC 2019 Refinement (Vol 4, Ch 11) Tier 1 factors (climate-aggregated): direct EF1 = 0.010; indirect via volatilisation EF4 = 0.010 applied to the volatilised fraction (FracGASF = 0.11 for synthetic, FracGASM = 0.21 for manure; crop residues do not volatilise, Eq 11.9); indirect via leaching FracLEACH = 0.24 times EF5 = 0.011. N2O-N is converted to N2O by 44/28 and to CO2e with the chosen GWP100.
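
The factor chain can be worked through for a single crop as below. The emission factors and fractions are the ones quoted above; the nitrogen inputs are made-up values, and applying the leaching pathway to all three inputs is an assumption of this sketch rather than something stated on this page.

```r
# IPCC 2019 Tier 1 factors as quoted above.
ef1 <- 0.010
ef4 <- 0.010
ef5 <- 0.011
frac_gasf <- 0.11
frac_gasm <- 0.21
frac_leach <- 0.24
gwp_n2o <- 273  # "ar6"

# Illustrative N inputs to one crop, kg N (made-up values).
f_sn <- 100  # synthetic fertiliser
f_on <- 50   # applied manure
f_cr <- 20   # crop residues

n_total <- f_sn + f_on + f_cr
direct <- n_total * ef1
# Crop residues do not volatilise, so only F_SN and F_ON enter here.
volatilised <- (f_sn * frac_gasf + f_on * frac_gasm) * ef4
# Assumption: leaching applies to all three inputs.
leached <- n_total * frac_leach * ef5

n2o_n_kg <- direct + volatilised + leached  # kg N2O-N
n2o_kg <- n2o_n_kg * 44 / 28                # kg N2O
co2e_kg <- n2o_kg * gwp_n2o                 # kg CO2e
co2e_kg
```

Switching the gwp standard only changes the last multiplication; the N2O-N estimate itself is unaffected.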

Manure deposited by grazing animals (F_PRP, which uses the grazing EF3 on pasture) and below-ground residue N are further Tier 1 inputs not yet included.

Usage

build_crop_soil_n2o_extension(
  gwp = c("ar6", "ar5", "ar4"),
  residue_removed_frac = 0.45,
  data = list(),
  example = FALSE
)

Arguments

gwp

100-year global warming potential standard for N2O, "ar6" (default, 273), "ar5" (265) or "ar4" (298).

residue_removed_frac

Fraction of above-ground crop residue removed from the field (for feed, fuel or construction) and therefore not returned to soil. Defaults to 0.45, a global mid-range value; country-specific removal (gleam_fracremove) is a future refinement.

data

Optional named list of pre-loaded inputs to avoid remote reads: primary_prod (get_primary_production(), for harvested area), fertilizer (the faostat-fertilizer-nutrients pin), manure (the faostat-emissions-livestock pin) and primary_residues (get_primary_residues()). Each falls back to its reader when absent.

example

If TRUE, return a small fixture instead of reading remote data. Defaults to FALSE.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (soil N2O in kilograms CO2e) and method_soil_n2o.

Examples

build_crop_soil_n2o_extension(example = TRUE)

Build a per-crop physical land extension from CROPGRIDS.

Description

Convert FAOSTAT harvested area into per-crop physical cropland using CROPGRIDS, then return a land extension keyed by (year, area_code, item_cbs_code) for the FABIO footprint model.

CROPGRIDS (Tang et al. 2024) reports, per crop and country, both harvested area and crop (physical) area for 2020. Their ratio is a genuinely per-crop multi-cropping correction (e.g. rice ~0.81, heavily double-cropped; most other crops ~0.95-1.0, single-cropped), which a single country-level cropping-intensity factor, or a cell-level apportionment by harvested share, cannot reproduce. This function applies that per-(area, item) physical / harvested ratio to WHEP harvested area in each year: $\text{physical} = \text{harvested} \times (\text{physical}_{cg} / \text{harvested}_{cg})$.

Note: CROPGRIDS physical area is the land where each crop actually grows; it excludes fallow land (unlike fallow-inclusive cropland-apportionment), so totals are typically a few percent below harvested area.

Coverage: crops absent from CROPGRIDS (notably the FAOSTAT fodder items and a few minor crops) have no physical/harvested ratio and fall back to a ratio of 1 (physical = harvested, no multi-cropping correction). The share of harvested area hitting this fallback is reported via a warning.
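
The ratio, fallback, and cap can be sketched for one country in base R. This is an illustration rather than the package's code: item 9999 is a made-up code standing in for a crop absent from CROPGRIDS, the column names physical_cg and harvested_cg are hypothetical, and the min_cropgrids_ha floor with its global per-item fallback is left out.

```r
harvested <- data.frame(
  item_cbs_code = c(2511L, 2807L, 9999L),  # 9999: made-up item, not in CROPGRIDS
  harvested_ha = c(1000, 500, 300)
)
cropgrids <- data.frame(
  item_cbs_code = c(2511L, 2807L),
  physical_cg = c(990, 400),
  harvested_cg = c(1000, 500)
)
max_ratio <- 1.5

# Keep every harvested row; unmatched items get NA ratios.
x <- merge(harvested, cropgrids, by = "item_cbs_code", all.x = TRUE)
ratio <- x$physical_cg / x$harvested_cg
ratio[is.na(ratio)] <- 1         # no CROPGRIDS match: physical = harvested
ratio <- pmin(ratio, max_ratio)  # clamp spurious ratios

x$physical_ha <- x$harvested_ha * ratio
x[, c("item_cbs_code", "harvested_ha", "physical_ha")]
```

The share of harvested area on the fallback row is the quantity the function reports in its warning.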

Usage

build_cropgrids_land_extension(
  harvested = NULL,
  cropgrids = NULL,
  source = c("cropgrids", "cropgrids_fallow"),
  max_ratio = 1.5,
  min_cropgrids_ha = 100
)

Arguments

harvested

Tibble of harvested area with columns year, area_code, item_cbs_code, harvested_ha. If NULL, built from get_primary_production() (unit == "ha").

cropgrids

Tibble of national crop areas with columns area_code, item_cbs_code, physical_ha, harvested_ha. If NULL, the remote pin selected by source is read via whep_read_file().

source

Which CROPGRIDS pin to read when cropgrids is NULL: "cropgrids" (cropgrids-land: physical crop area, excludes fallow) or "cropgrids_fallow" (cropgrids-fallow-land: physical area with rotational fallow attributed to crops by attribute_fallow_to_crops()). Also recorded in method_land.

max_ratio

Cap on the per-area physical/harvested ratio (default 1.5). CROPGRIDS occasionally pairs a normal physical area with a near-zero harvested area for minor/aggregate crops, yielding a spurious ratio in the hundreds; physical area cannot realistically exceed harvested by more than the fallow share, so the ratio is clamped here.

min_cropgrids_ha

Minimum CROPGRIDS harvested area (ha, default 100) for a per-area physical/harvested ratio to be trusted. CROPGRIDS leaves rounding stubs of a few hectares for marginal crop-country pairs whose ratio is unreliable; below this floor the crop falls through to the global per-item ratio instead.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (physical land area in hectares), and method_land.

Examples

harvested <- tibble::tribble(
  ~year, ~area_code, ~item_cbs_code, ~harvested_ha,
  2000L, 33L, 2511L, 1000,
  2000L, 33L, 2807L, 500
)
cropgrids <- tibble::tribble(
  ~area_code, ~item_cbs_code, ~physical_ha, ~harvested_ha,
  33L, 2511L, 990, 1000,
  33L, 2807L, 400, 500
)
build_cropgrids_land_extension(harvested, cropgrids)

Build detailed bilateral trade matrix

Description

Construct the detailed bilateral trade matrix (DTM) from the FAOSTAT Detailed Trade Matrix pin. Reports trade flows between pairs of countries with their trade shares, aggregated to polity level and mapped to CBS item codes.

Optionally extends the time series by joining with commodity balance sheet years and gap-filling country shares via linear interpolation.
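
Gap-filling a single partner's share can be sketched with stats::approx(). This is an assumption about the general approach, not the package's code: the shares are made-up values, and the min_share filter is applied after interpolation.

```r
# Known partner shares in two years (illustrative values).
known_years <- c(2000, 2005)
known_share <- c(0.2, 0.4)

# Linear interpolation of the share for every year in between.
all_years <- 2000:2005
share <- stats::approx(known_years, known_share, xout = all_years)$y

filled <- data.frame(year = all_years, country_share = share)

# Drop rows whose share falls below the threshold.
min_share <- 1e-04
filled <- filled[filled$country_share >= min_share, ]
filled
```

By default stats::approx() returns NA outside the range of known years, so extending beyond the first or last observation would need a separate rule.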

Usage

build_detailed_trade(
  raw_trade = NULL,
  cbs = NULL,
  min_share = 1e-04,
  extend_time = FALSE,
  example = FALSE
)

Arguments

raw_trade

A data.table or tibble of raw FAOSTAT bilateral trade data. If NULL (default), the data is read from the "faostat-trade-bilateral" pin.

cbs

A tibble of commodity balance sheets in wide format, as returned by build_commodity_balances() or get_wide_cbs(). Required when extend_time = TRUE.

min_share

Numeric. Partners with a country share below this threshold are dropped when extending time. Default 0.0001.

extend_time

Logical. If TRUE, extend the time series using CBS years and linear interpolation of country shares. Default FALSE.

example

Logical. If TRUE, return a small example tibble without downloading remote data. Default FALSE.

Value

A tibble with columns:

year: Integer year.
area_code: Numeric polity code of the reporter country.
area_code_partner: Numeric polity code of the partner country.
element: Either "import" or "export".
item_cbs_code: Numeric CBS item code.
unit: Measurement unit ("tonnes" or "heads").
value: Trade quantity.
country_share: Share of total trade for this partner.

Examples

build_detailed_trade(example = TRUE)

Build the livestock energy-use CO2 footprint extension (meat only).

Description

Aggregate GLEAM 3.0 on-farm (direct) and feed-production (embedded) energy use into a footprint extension keyed by (year, area_code, item_cbs_code), expressed in kilograms of carbon-dioxide equivalent (CO2e). This is the energy slice of the livestock greenhouse-gas basket and is designed to be summed with build_livestock_ghg_extension() (enteric and manure CH4/N2O), which keys on the same live-animal sectors.

The GLEAM energy emission factors are expressed per kilogram of live weight (see gleam_energy_use_ef), which is well defined for meat but not for milk or eggs, so the extension covers meat only: bovine (item_cbs_code 961 non-dairy cattle and 946 buffalo), sheep (976) and goat (1016), pig (1049 and 1051) and broiler-chicken (1053) meat. Milk and eggs keep their CH4/N2O but get no energy CO2.

For each meat group the live weight produced is recovered from FAOSTAT carcass production divided by a GLEAM dressing fraction (gleam_dressing_percentages), multiplied by a per-country energy intensity (embedded + direct), and then attributed to the contributing live-animal sectors in proportion to their slaughtered head counts. Because GLEAM reports its factors by production system and climate zone but the package has no country-level system or climate shares, the intensities are collapsed to one value per country by an unweighted mean across systems and climate zones; this choice is recorded in method_energy.
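
The arithmetic in the paragraph above can be sketched in base R as follows. Every number is invented for illustration (none is a GLEAM value), and the sector and system names are hypothetical.

```r
# All numbers are made up for illustration; none are GLEAM values.
carcass_t <- 1000                 # carcass production, tonnes
dressing_fraction <- 0.5          # carcass weight / live weight
live_weight_kg <- carcass_t * 1000 / dressing_fraction

# Energy intensities (kg CO2e per kg live weight) by production system
# and climate zone, collapsed to one value with an unweighted mean.
embedded <- c(grass_arid = 0.10, grass_humid = 0.20, mixed_arid = 0.30)
direct <- c(grass_arid = 0.02, grass_humid = 0.04, mixed_arid = 0.06)
intensity <- mean(embedded) + mean(direct)

group_co2e_kg <- live_weight_kg * intensity

# Attribute the group total to live-animal sectors by slaughtered heads.
slaughtered_heads <- c(sector_x = 3000, sector_y = 1000)
sector_co2e_kg <- group_co2e_kg * slaughtered_heads / sum(slaughtered_heads)
```

The sector values always sum back to the group total, so the attribution step only redistributes emissions and never creates or loses any.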

Usage

build_energy_co2_extension(method = c("gleam"), data = list(), example = FALSE)

Arguments

method

Estimation method. Only "gleam" (default), the GLEAM 3.0 per-live-weight factors, is currently available.

data

Optional named list of pre-loaded inputs to avoid remote reads: primary_prod (the get_primary_production() output). It falls back to its reader when absent.

example

If TRUE, return a small fixture instead of reading remote data. Defaults to FALSE.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (energy-use emissions in kilograms CO2e) and method_energy (e.g. "GLEAM_3.0_energy_meat").

Examples

build_energy_co2_extension(example = TRUE)

Build a per-crop physical land extension with FAO fallow-inclusive arable land.

Description

Turn per-crop harvested-derived physical area into a fallow-inclusive physical land extension whose arable-crop total reconciles to FAO's physical Arable land and whose perennial-crop total reconciles to FAO's physical Permanent crops (get_arable_permanent_land()), per (area_code, year).

This is the FAO-land-base analogue of build_cropgrids_land_extension(source = "cropgrids_fallow"). The existing method takes the fallow magnitude from FAOSTAT "Temporary fallow" (item 6640, a sparse and, for many rain-fed economies, absent series) applied to a single CROPGRIDS 2020 snapshot. Here the fallow magnitude is the physical arable land that carried no harvest in that specific year, FAO Arable land - sum(cropped arable physical), so a drought year's resting cropland is charged to the crops whose rotation it supports and the arable-crop footprint totals match FAO's land survey in every year (see the Tunisia/Portugal motivation in get_arable_permanent_land()).

Reconciliation, per (area_code, year):

Arable crops (items_prod_full$Herb_Woody != "Woody"): rotational fallow max(0, arable_ha - S) (with S the cropped arable physical total) is distributed with attribute_fallow_to_crops() using fallow_weights, so the arable total reaches arable_ha. Where the cropped physical already exceeds arable_ha (heavy multi-cropping, or inflated fodder harvested area) there is no fallow to add and the arable crops are scaled down to arable_ha instead, the physical-container correction. Either way the arable total equals FAO arable_ha by construction.
Perennial crops (Herb_Woody == "Woody") receive no fallow and are scaled so their total equals FAO permanent_ha, preserving the within-group physical pattern. A positive target without a corresponding arable crop row or positive perennial base area is reported as an error because it cannot be reconciled without inventing a crop allocation.
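
The two reconciliation rules can be sketched for a single (area_code, year) in base R. The helpers reconcile_arable() and reconcile_perennial() are hypothetical names written for this illustration, not package functions, and they skip the error handling and weight fallbacks described in this manual.

```r
# Hypothetical helper: reconcile arable crops to the FAO arable total.
reconcile_arable <- function(cropped_ha, arable_ha, weights = cropped_ha) {
  s <- sum(cropped_ha)
  if (s > arable_ha) {
    # Cropped area exceeds the FAO total: scale down, add no fallow.
    return(cropped_ha * arable_ha / s)
  }
  fallow <- max(0, arable_ha - s)
  cropped_ha + fallow * weights / sum(weights)
}

# Hypothetical helper: scale perennial crops to the FAO permanent total.
reconcile_perennial <- function(cropped_ha, permanent_ha) {
  cropped_ha * permanent_ha / sum(cropped_ha)
}

arable <- c(wheat = 300, maize = 100)
with_fallow <- reconcile_arable(arable, arable_ha = 500)
scaled_down <- reconcile_arable(arable, arable_ha = 200)
perennial <- reconcile_perennial(c(coconuts = 80, olives = 20), 150)
```

In both arable cases the result sums to arable_ha, and the perennial result sums to permanent_ha while keeping the 80:20 pattern between the two crops.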

This is the crop-side default of the land-balance footprint (build_land_balance_footprint()).

Usage

build_fao_arable_fallow_extension(
  harvested = NULL,
  arable_permanent = NULL,
  base_extension = NULL,
  fallow_weights = NULL,
  temporary_grassland = NULL,
  items_prod_full = whep::items_prod_full
)

Arguments

harvested

Tibble of harvested area with columns year, area_code, item_cbs_code, harvested_ha. If NULL, built from get_primary_production() (unit == "ha"); passing a cached harvested table avoids that rebuild.

arable_permanent

Tibble of FAO physical land base with columns area_code, year, arable_ha, permanent_ha. If NULL, get_arable_permanent_land() is called for the years present in base_extension.

base_extension

Tibble of cropped (fallow-excluding) per-crop physical area with columns year, area_code, item_cbs_code, impact_u. If NULL, built with build_cropgrids_land_extension(source = "cropgrids") from harvested.

fallow_weights

Tibble of area_code, item_cbs_code, weight giving the within-country fallow allocation weight, e.g. from gridded_fallow_weights() (the recommended agro-climatic, rainfed-gated weight). If NULL, fallow is distributed in proportion to each arable crop's cropped physical area (perennials always excluded). The cropped-area fallback is used independently for an area when it has no usable supplied weights, a non-finite or negative supplied weight, or a non-positive total.

temporary_grassland

Tibble of grassland occupation in the build_grassland_land_extension() schema (area_code, year, item_cbs_code, impact_u); its CBS 3002 rows are the temporary grassland netted out of the arable target so ordinary crops plus CBS 3002 reconcile to FAO Arable land (see the temporary-grassland section). If NULL (default) it is built with build_grassland_land_extension(grassland_metric = "occupation") so netting still applies (correct but slow); supply the table to skip that rebuild, or pass one with no CBS 3002 rows to opt out.

items_prod_full

Crosswalk used to classify item_cbs_code as arable or perennial via Herb_Woody. Defaults to items_prod_full.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (fallow-inclusive physical land in hectares), and method_land ("fao_arable_fallow").

Temporary grassland (no double-count)

FAO's Arable land total includes temporary meadows and pastures: temporary grassland is part of cropland, not grassland. That land is also reported separately as CBS 3002 (Temporary grassland) by build_grassland_land_extension(), so naively summing both extensions would count it twice. Pass that grassland occupation as temporary_grassland and its CBS 3002 rows are netted out of the arable target before ordinary crops are reconciled, enforcing the following invariant per (area_code, year): ordinary crop occupation (incl. fallow) + CBS 3002 = FAO Arable land. The land-balance footprint (build_land_balance_footprint()) does exactly this, passing the grassland occupation it has already built. When temporary_grassland is NULL (default), the grassland occupation extension is built internally so netting still happens; this is correct but slow, since that build reruns much of the pipeline, so supply the table to avoid the rebuild. Where modelled CBS 3002 exceeds FAO Arable land (survey vs. fodder-reconstruction mismatch), the arable target is clamped at 0 and a warning is emitted.
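
A minimal base R sketch of the netting arithmetic, using the same magnitudes as the example in this topic (500 ha of arable land, 100 ha of CBS 3002, 300 ha of cropped wheat). The variable names are chosen for this illustration only, and the sketch shows the invariant, not the function's actual output.

```r
arable_ha <- 500                # FAO Arable land
temporary_grassland_ha <- 100   # CBS 3002 occupation
cropped_wheat_ha <- 300         # the only arable crop in this toy case

# Net temporary grassland out of the arable target, clamping at zero.
arable_target <- max(0, arable_ha - temporary_grassland_ha)

# With a single arable crop, all rotational fallow is charged to it.
fallow_ha <- max(0, arable_target - cropped_wheat_ha)
wheat_occupation_ha <- cropped_wheat_ha + fallow_ha

# Invariant: ordinary crops (incl. fallow) + CBS 3002 = FAO Arable land.
invariant_holds <- wheat_occupation_ha + temporary_grassland_ha == arable_ha
```

Without the netting step the wheat row would be reconciled to the full 500 ha, and adding the 100 ha of temporary grassland would then overshoot FAO Arable land.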

Examples

harvested <- tibble::tribble(
  ~year, ~area_code, ~item_cbs_code, ~harvested_ha,
  2020L, 1L, 2511L, 300, # wheat (arable)
  2020L, 1L, 2560L, 100 # coconuts (perennial)
)
base_extension <- tibble::tribble(
  ~year, ~area_code, ~item_cbs_code, ~impact_u,
  2020L, 1L, 2511L, 300,
  2020L, 1L, 2560L, 100
)
arable_permanent <- tibble::tribble(
  ~area_code, ~year, ~arable_ha, ~permanent_ha,
  1L, 2020L, 500, 100
)
items <- tibble::tribble(
  ~item_cbs_code, ~Herb_Woody,
  2511L, "Herbaceous",
  2560L, "Woody"
)
temporary_grassland <- tibble::tribble(
  ~area_code, ~year, ~item_cbs_code, ~impact_u,
  1L, 2020L, 3002L, 100 # temporary grassland netted out of arable
)
build_fao_arable_fallow_extension(
  harvested, arable_permanent, base_extension,
  temporary_grassland = temporary_grassland,
  items_prod_full = items
)

Build livestock feed demand.

Description

Estimate the dry-matter feed demand of each livestock category: the first stage of get_feed_intake(), exposed on its own. Demand is national, per (year, area_code, livestock_category), and is computed before any matching against feed supply, so it can be audited or reused (for example in land or nitrogen footprints) independently of the allocation.

Usage

build_feed_demand(
  demand_tier = c("ipcc", "fcr"),
  by = c("category", "feed_type"),
  example = FALSE
)

Arguments

demand_tier

Demand-estimation tier. "ipcc" (default) uses the IPCC Tier-2 energy model for the ruminant species, Bouwman feed-conversion ratios for pigs and poultry, and Krausmann per-head intake for draft and other species. "fcr" uses the Bouwman / Krausmann magnitude for every species. The method actually used for each row is recorded in method_demand.

by

Output grain. "category" (default) returns the per-livestock category demand. "feed_type" splits it across feed types and returns the feed_demand table that redistribute_feed() consumes, so the two compose: build_feed_demand(by = "feed_type") |> redistribute_feed(feed_avail).

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

With by = "category", a tibble with one row per (year, area_code, livestock_category):

year: The year of the demand.
area_code: The country code. For code details see e.g. add_area_name().
livestock_category: The feed-demand grouping of livestock (e.g. Cattle_milk, Cattle_meat, Pigs, Poultry).
demand_dm_t: Dry-matter feed demand in tonnes.
method_demand: The demand method(s) used, e.g. ipcc_tier2_energy, bouwman_fcr or krausmann_per_head (a +-joined set for a mixed category whose animals used different methods).

With by = "feed_type", the demand split across feed types as the redistribute_feed() feed_demand contract: year, territory, sub_territory, livestock_category, item_cbs_code, feed_group, feed_quality, demand_dm_t, fixed_demand.

Examples

build_feed_demand(example = TRUE)
build_feed_demand(example = TRUE, by = "feed_type")

Build local (per-cell) feed intake, chunked by year.

Description

Runs the redistribute_feed() local path (0.5-degree cell grain) one year at a time, so the per-cell allocation stays within memory and the full multi-year run is restartable. By default it sources pinned LPJmL-derived grass availability and pinned gridded livestock inputs. Pass run_dir, grass_availability, grass_availability_path, or input_dir to use custom local inputs instead.
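
The restartable, one-file-per-year pattern described above can be sketched in base R. Here process_year() and run_by_year() are hypothetical stand-ins, and .rds files replace parquet so the sketch needs no extra packages; this illustrates the pattern rather than the function's implementation.

```r
# Hypothetical stand-in for the expensive per-year computation.
process_year <- function(year) data.frame(year = year, value = year * 2)

run_by_year <- function(years, out_dir, overwrite = FALSE) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  paths <- file.path(out_dir, paste0("result_", years, ".rds"))
  for (i in seq_along(years)) {
    # A year whose file already exists is skipped unless overwrite = TRUE,
    # so an interrupted batch can simply be started again.
    if (file.exists(paths[i]) && !overwrite) next
    saveRDS(process_year(years[i]), paths[i])
  }
  invisible(paths)
}

out_dir <- file.path(tempdir(), "chunked_run")
paths <- run_by_year(2000:2002, out_dir)
```

Calling run_by_year() a second time with the same arguments writes nothing, because every output file is already present.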

Usage

build_feed_intake_local(
  years = NULL,
  out_dir = NULL,
  demand_tier = c("ipcc", "fcr"),
  feed_mode = c("historical", "scenario"),
  overwrite = FALSE,
  example = FALSE,
  run_dir = NULL,
  input_dir = NULL,
  grass_availability = NULL,
  grass_availability_path = NULL
)

Arguments

years

Integer vector of years to build. Default NULL builds every year present in the production data.

out_dir

Directory to write per-year feed_intake_local_<year> parquet files to. If NULL, the bound result is returned in memory (only practical for a few years).

demand_tier

Demand-estimation tier, "ipcc" (default) or "fcr".

feed_mode

Whether to distribute surplus feed availability. "historical" (default) suppresses the surplus-distribution pass: the CBS feed element is treated as realised consumption, so leftover availability is not dumped onto variable-demand livestock (which would inflate non-grass intake). "scenario" distributes the surplus.

overwrite

Re-run years whose output file already exists. Default FALSE skips them so the batch is restartable.

example

If TRUE, return a small example output without sourcing the remote and gridded data. Default is FALSE.

run_dir

Optional path to a finished local LPJmL output directory holding pft_npp.nc and cftfrac.nc. If NULL, pinned grass availability is used unless grass_availability or grass_availability_path is supplied.

input_dir

Optional directory holding locally prepared spatialization inputs. If NULL, pinned gridded livestock/spatial inputs are used.

grass_availability

Optional already-derived grass availability tibble/data frame passed to build_grass_availability_lpjml().

grass_availability_path

Optional path to an already-derived grass availability artifact passed to build_grass_availability_lpjml().

Value

When out_dir is NULL, a tibble in the get_feed_intake() contract plus a sub_territory (0.5-degree cell) column. Otherwise, invisibly, the written file paths.

Examples

build_feed_intake_local(example = TRUE)

Compute a footprint end-to-end from an extension table.

Description

Trace a long-format environmental extension table through the supply chain for one or more years and return a tidy footprint. This wraps the three steps that the footprint driver scripts used to repeat inline: build (or reuse) the input-output model with build_io_model(), align the extension to each year's sector labels with align_extension(), and trace it with compute_footprint().

The extension table is the output of any build_*_extension() builder, such as build_grassland_land_extension() or build_livestock_ghg_extension(): rows keyed by year, area_code and item_cbs_code, with the pressure magnitude in value_col.
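
For readers unfamiliar with input-output footprints, the sketch below traces an extension through a toy two-sector system with the textbook Leontief inverse in base R. It illustrates the general technique only, under the assumption of a balanced system; it is not a description of what compute_footprint() does internally, which may differ (for example in allocation or conservation handling).

```r
# Toy two-sector system, balanced so that rowSums(Z) + Y equals X.
Z <- matrix(c(0, 5, 10, 0), nrow = 2)   # inter-sector flows
Y <- matrix(c(90, 195), ncol = 1)       # final demand
X <- rowSums(Z) + as.vector(Y)          # total output: 100 and 200
extension <- c(50, 30)                  # pressure per producing sector

A <- sweep(Z, 2, X, "/")                # technical coefficients Z[i, j] / X[j]
L <- solve(diag(2) - A)                 # Leontief inverse (I - A)^-1
intensity <- extension / X              # pressure per unit of output

# Pressure embodied in each sector's deliveries to final demand.
footprint_by_product <- as.vector(intensity %*% L) * as.vector(Y)
total_footprint <- sum(footprint_by_product)
```

Because the toy system is balanced, the total footprint equals the total extension (80): tracing redistributes the pressure across final products without changing its sum.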

Usage

build_footprint(
  extension,
  years = NULL,
  io = NULL,
  method = c("mass", "value"),
  value_col = "impact_u",
  ...
)

Arguments

extension

Long-format extension tibble with columns year, area_code, item_cbs_code and the column named by value_col.

years

Years to compute. Defaults to the distinct years present in extension. Ignored when io is supplied.

io

Optional pre-built build_io_model() result (a tibble with one row per year). Supply it to reuse one model across several extensions instead of rebuilding it. When NULL (default), it is built for years.

method

Co-product allocation method passed to build_io_model(), "mass" (default) or "value". Ignored when io is supplied (the model already encodes its allocation).

value_col

Name of the extension magnitude column, "impact_u" by default.

...

Further arguments passed to compute_footprint() (e.g. conserve_extensions, report_conservation).

Value

A tibble of footprint flows as returned by compute_footprint(), with an added year column.

Examples

io <- tibble::tibble(
  year = 2000L,
  Z = list(matrix(c(0, 5, 10, 0), nrow = 2)),
  X = list(c(100, 200)),
  Y = list(matrix(c(85, 195), ncol = 1)),
  labels = list(tibble::tibble(
    index = 1:2,
    area_code = c(1L, 1L),
    item_cbs_code = c(1L, 2L)
  )),
  fd_labels = list(tibble::tibble(area_code = 1L, fd_col = "food"))
)
extension <- tibble::tibble(
  year = 2000L,
  area_code = 1L,
  item_cbs_code = c(1L, 2L),
  impact_u = c(50, 30)
)
build_footprint(extension, io = io)

Build grazable grass availability.

Description

Multi-method wrapper for the grass forage supply ceiling that feeds allocation. The default lpjml method reads pinned LPJmL-derived managed-grassland net primary production/availability unless custom artifact data, a custom artifact path, or run_dir points to local inputs; coefficient applies a per-area grass-yield coefficient and is not yet implemented (it needs a grass_yield_coef dataset).

Usage

build_grass_availability(method = c("lpjml", "coefficient"), ...)

Arguments

method

Grass-availability method, "lpjml" or "coefficient".

...

Passed to the selected method's builder, e.g. build_grass_availability_lpjml().

Value

A tibble of grass availability with a method_grass column recording the method used.

Examples

build_grass_availability(method = "lpjml", example = TRUE)

Build grazable grass availability from an LPJmL run.

Description

Reads managed-grassland net primary production/availability (the LPJmL grassland CFT) from the pinned WHEP artifact by default. Pass availability or availability_path to use a custom already-derived artifact; pass run_dir to read a finished local LPJmL run instead and convert NPP to grazable above-ground dry-matter availability, the forage supply ceiling for feed allocation. Availability is the production flux, not the realised grazing off-take (the off-take is the intake-validation target, not the supply).
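
The unit chain from NPP (grams of carbon per square metre) to grazable dry matter (tonnes per hectare) can be sketched as below. The three share parameters and their values are assumptions invented for this example; the real parameters come from grass_access_shares() and may be named and valued differently.

```r
npp_gc_m2 <- c(150, 400)        # annual grassland NPP, g C per m2

# Hypothetical parameters; the real ones come from grass_access_shares().
carbon_fraction <- 0.45         # g C per g dry matter
aboveground_share <- 0.5        # share of NPP produced above ground
accessible_share <- 0.6         # share reachable by grazing animals

# 1 g per m2 = 10 kg per ha = 0.01 t per ha.
npp_tc_ha <- npp_gc_m2 * 0.01
avail_dm_t_ha <- npp_tc_ha / carbon_fraction * aboveground_share *
  accessible_share

grass_area_ha <- c(1000, 250)   # managed grassland area in each cell
avail_dm_t <- avail_dm_t_ha * grass_area_ha
```

The result is a supply ceiling: it says how much forage the cell could provide, not how much animals actually graze.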

Usage

build_grass_availability_lpjml(
  run_dir = NULL,
  years = NULL,
  first_year = 1901L,
  shares = grass_access_shares(),
  example = FALSE,
  availability = NULL,
  availability_path = NULL
)

Arguments

run_dir

Path to the LPJmL run output directory holding pft_npp.nc and cftfrac.nc (the scenario_* output folder). If unset, the pinned lpjml-grass-availability artifact is used.

years

Integer vector of calendar years to read.

first_year

First calendar year of the run's output time axis.

shares

Accessibility and conversion parameters from grass_access_shares().

example

If TRUE, return a small fixture instead of reading a run.

availability

Optional already-derived grass availability tibble/data frame. Takes precedence over pinned data and run_dir.

availability_path

Optional path to an already-derived grass availability artifact (.parquet, .csv, or .rds). Takes precedence over pinned data and run_dir.

Value

A tibble with lon, lat, year, grass_npp_gc_m2, grass_avail_dm_t_ha and grass_avail_dm_t.

Examples

build_grass_availability_lpjml(example = TRUE)

Build the native grassland land extension.

Description

Produce a grassland land extension keyed by (year, area_code, item_cbs_code), replacing the grassland rows that used to come from the external land_fp pin.

Two area sources are available, selected with source:

"luh2" (default): permanent and temporary grassland area (item_cbs 3000 and 3002, LUH2 pasture and rangeland) taken from build_primary_production(). This shares the gridded LUH2 land-use basis used by the crop land extensions and by livestock spatialisation. Rotational fallow (item_cbs 3003) is excluded because the cropgrids_fallow crop extension already attributes fallow to crops, so counting it here too would double count it.
"faostat_pasture": FAOSTAT "Permanent meadows and pastures" area (Land Use item 6655), the statistics-based basis comparable to most published footprint studies.

Two metrics are available, selected with grassland_metric:

"occupation" (default): the full grassland area is charged as occupied land.
"active_grazing": grassland is capped at the area implied by actual grazing intake (the "grass" feed in get_feed_intake()) divided by a usable grass yield, so ungrazed or marginal rangeland is not charged.
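
The difference between the two metrics can be illustrated with a small base R sketch. The areas and intakes are invented, and the cap is written as a plain pmin(), which is one reading of the description above and not necessarily the exact implementation.

```r
grassland_ha <- c(a = 1000, b = 1000)        # full grassland area
grass_intake_dm_t <- c(a = 1030, b = 4120)   # grazed grass, tonnes DM
usable_grass_yield_dm_t_ha <- 2.06           # the documented default

# "occupation": charge the full grassland area.
occupation_ha <- grassland_ha

# "active_grazing": charge at most the area implied by grazing intake.
grazed_ha <- grass_intake_dm_t / usable_grass_yield_dm_t_ha
active_grazing_ha <- pmin(grassland_ha, grazed_ha)
```

Country a is charged only the 500 ha its intake implies, while country b, whose implied area (2000 ha) exceeds its grassland, stays at the full 1000 ha.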

Usage

build_grassland_land_extension(
  source = c("luh2", "faostat_pasture"),
  grassland_metric = c("occupation", "active_grazing"),
  usable_grass_yield_dm_t_ha = 2.06,
  data = list(),
  example = FALSE
)

Arguments

source

Grassland area source, "luh2" (default) or "faostat_pasture".

grassland_metric

Grassland land metric, "occupation" (default) or "active_grazing".

usable_grass_yield_dm_t_ha

Usable grass yield in dry-matter tonnes per hectare, used only by "active_grazing". Defaults to 2.06.

data

Optional named list of pre-loaded inputs to avoid remote reads: primary_prod (for source = "luh2"), landuse (the faostat-landuse pin, for source = "faostat_pasture") and feed_intake (for grassland_metric = "active_grazing"). Each falls back to its reader (get_primary_production(), whep_read_file(), get_feed_intake()) when absent.

example

If TRUE, return a small fixture instead of reading remote data. Defaults to FALSE.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (grassland area in hectares) and method_grassland (the chosen metric).

Examples

build_grassland_land_extension(example = TRUE)

Build a grazing-land footprint by forward feed-allocation.

Description

End-to-end grazing-land footprint for one year on real WHEP data. Grazing land (the grassland extension) is pushed forward onto livestock meat and milk via allocate_grazing_to_products(), then routed to consuming countries through the bilateral meat-trade network with compute_footprint_balance(). This attributes grazing land to the meat and dairy consumption that drives it, which is the chain the Leontief footprint leaks because grass is non-productive and not traded.

The inputs are assembled from build_grassland_land_extension(), get_feed_intake(), get_primary_production() and get_bilateral_trade(). Supply any of them through data to reuse cached inputs or to test without remote reads.

intake_basis selects the intake that splits grazing land across animals. The default "grazer_forage" uses every grazing animal's grass plus roughage-residue intake, because the feed model classes the forage of major extensive-grazing countries (for example Australia, the United States and Argentina) as residues rather than grass; keying on the "grass" feed type alone would silently strip that land. "grass" reproduces the narrower grass-feed basis for sensitivity analysis. Land in countries with no grazer intake at all cannot be attributed and is reported via a warning.
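
The effect of intake_basis can be seen in a toy split of one country's grazing land across two animal groups. The intake figures and the helper split_land() are hypothetical and only illustrate a proportional split; the package's allocation may include further steps.

```r
intake <- data.frame(
  animal = c("cattle", "sheep"),
  grass_dm_t = c(100, 100),     # intake classed as grass
  residue_dm_t = c(300, 0)      # roughage-residue intake
)
grazing_land_ha <- 1000

# Hypothetical helper: split land in proportion to an intake basis.
split_land <- function(basis_dm_t, land_ha) {
  land_ha * basis_dm_t / sum(basis_dm_t)
}

# "grazer_forage": grass plus roughage residues.
by_grazer_forage <- split_land(
  intake$grass_dm_t + intake$residue_dm_t, grazing_land_ha
)
# "grass": the narrower grass-only basis.
by_grass <- split_land(intake$grass_dm_t, grazing_land_ha)
```

With the forage basis cattle receive 800 ha and sheep 200 ha; with the grass-only basis the land is split evenly, which shows how much the choice of basis can move the attribution.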

Usage

build_grazing_feed_footprint(
  year,
  products = c("all", "meat_milk"),
  intake_basis = c("grazer_forage", "grass"),
  data = list(),
  example = FALSE
)

Arguments

year

Year to build the footprint for.

products

Co-product split passed to allocate_grazing_to_products(). "all" (default) or "meat_milk".

intake_basis

Intake used to split land across animals, "grazer_forage" (default) or "grass".

data

Optional named list of pre-built inputs, any of grass_land, grazer_intake, livestock_production and trade. Each falls back to its builder when absent.

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with area_code (consuming country), item_cbs_code (meat or milk item), value (embodied grazing land) and method ("grazing_feed_allocation").

Examples

build_grazing_feed_footprint(example = TRUE)

Build gridded landuse dataset

Description

Disaggregate country-level FAOSTAT crop harvested areas to a 0.5-degree grid. This reproduces the core spatialization workflow of the LandInG toolbox, adapted to WHEP conventions and tidy data structures.

The algorithm follows three main steps:

Each crop's country total is distributed to grid cells proportionally to a spatial reference pattern (e.g. Monfreda) weighted by gridded cropland extent (e.g. LUH2/HYDE).
If total allocated harvested area in any cell exceeds its capacity (cropland times multi-cropping suitability), excess is iteratively redistributed using a logit-based transformation.
Individual crops are aggregated into crop functional types (CFTs).
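
The first step and the capacity check of the second step can be sketched in base R with the same toy magnitudes as the example in this topic. Treating the weight as harvest_fraction times cropland_ha is an assumption about how "weighted by" is applied, and the iterative logit-based redistribution is not reproduced here.

```r
cells <- data.frame(
  lon = c(0.25, 0.75),
  lat = c(50.25, 50.25),
  harvest_fraction = c(0.6, 0.4),
  cropland_ha = c(800, 500)
)
country_total_ha <- 1000

# Step 1: distribute the country total in proportion to the cell weight.
weight <- cells$harvest_fraction * cells$cropland_ha
cells$allocated_ha <- country_total_ha * weight / sum(weight)

# Step 2 (check only): capacity is cropland times the multi-cropping
# factor, taken as 1 here; any excess would need to be redistributed.
multicropping_factor <- 1
cells$capacity_ha <- cells$cropland_ha * multicropping_factor
cells$excess_ha <- pmax(0, cells$allocated_ha - cells$capacity_ha)
```

In this toy case no cell exceeds its capacity, so the redistribution loop would have nothing to do.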

Usage

build_gridded_landuse(
  country_areas,
  crop_patterns,
  gridded_cropland,
  country_grid,
  config = list()
)

Arguments

country_areas

A tibble with country-level crop harvested areas. Expected columns:

year: Integer year.
area_code: Country code (numeric, matching WHEP polities).
item_prod_code: FAOSTAT item code for the crop.
harvested_area_ha: Total harvested area in hectares.
irrigated_area_ha: Irrigated harvested area in hectares (optional, defaults to 0).

crop_patterns

A tibble with per-cell spatial crop patterns. Expected columns:

lon: Longitude of cell centre.
lat: Latitude of cell centre.
item_prod_code: FAOSTAT item code.
harvest_fraction: Cropping intensity (Monfreda area divided by reference cropland).

gridded_cropland

A tibble with per-cell cropland extent. Expected columns:

lon: Longitude of cell centre.
lat: Latitude of cell centre.
year: Integer year.
cropland_ha: Total cropland area in hectares.
irrigated_ha: Irrigated cropland in hectares (optional, defaults to 0).

country_grid

A tibble mapping grid cells to countries. Expected columns:

lon: Longitude of cell centre.
lat: Latitude of cell centre.
area_code: Country code.

Optional columns:

cell_area_frac (or area_frac): Fraction of the physical cell belonging to this polity compartment. Defaults to 1.
polycell_id, cell_id: Stable compartment/cell identifiers preserved in outputs when present.
year or validity intervals (valid_from/valid_to, start_year/end_year, from_year/to_year) for historical, time-varying polity overlays.

config

Named list of optional extras. Unknown keys raise an error. Recognised keys:

years: Integer vector of years to spatialize. If NULL (default), all years present in country_areas are processed. When supplied, country_areas, gridded_cropland, and type_cropland are filtered to this set before processing.
cft_mapping: A tibble mapping FAOSTAT items to CFT names (item_prod_code, cft_name). If NULL, no CFT aggregation is performed and individual crop results are returned.
type_cropland: A tibble with per-cell, per-year, per-type cropland (lon, lat, year, luh2_type, type_ha, type_irrig_ha). When provided alongside type_mapping, each crop is allocated only into cells containing its LUH2 type. If NULL, falls back to total cropland.
type_mapping: A tibble (item_prod_code, luh2_type) that maps each crop to its LUH2 type. If NULL, type-aware allocation is disabled even when type_cropland is provided.
multicropping: A tibble with per-cell multi-cropping suitability factors. Required columns: lon, lat, mc_rainfed, mc_irrigated. An optional year column keys factors to year (one row per cell per year); when present, the table is filtered to the current year before the capacity constraint is applied. When absent, the table is treated as a static spatial layer applied to every year. If NULL (default), the capacity constraint still runs with mc_rainfed = mc_irrigated = 1 (harvested area capped at physical cropland).
max_iterations: Maximum iterations for the redistribution loop. Default: 1000L.
expansion_threshold: Iteration number after which crops are allowed to expand into cells without an existing pattern. Default: 100L.

Value

A tibble with gridded crop (or CFT) harvested areas. Columns:

lon, lat: Cell centre coordinates.
year: Integer year.
area_code: WHEP polity code for this cell compartment.
polity_area_code, reporting_polity_code, reporting_polity_name, reporting_polity_has_geometry: Polity metadata for area_code.
polycell_id, cell_id: Preserved when supplied in country_grid.
crop_name or cft_name: Crop or CFT identifier.
rainfed_ha: Rainfed harvested area in the cell.
irrigated_ha: Irrigated harvested area in the cell.

Methodology

This function reimplements the spatial crop allocation from the LandInG toolbox (Ostberg et al. 2023, doi:10.5194/gmd-16-3375-2023) with the following extensions:

LUH2 crop-functional-type constraints (type_cropland + type_mapping parameters) restrict each crop to cells containing its LUH2 type (c3ann, c4ann, c3per, c3nfx). LandInG allocates to total cropland without type constraints.
MIRCA2000 crop-specific irrigated fractions (Portmann et al. 2010) for irrigation distribution, falling back to LUH2-proportional allocation.

Data sources

Country areas: FAOSTAT QCL via build_primary_production
Crop patterns: EarthStat / Monfreda et al. (2008)
Gridded cropland: LUH2 v2h (Hurtt et al. 2020)
Irrigation: MIRCA2000 (Portmann et al. 2010) + LUH2

Examples

# Minimal example with toy data
country_areas <- tibble::tribble(
  ~year, ~area_code, ~item_prod_code, ~harvested_area_ha,
  2000L, 1L, 15L, 1000
)
crop_patterns <- tibble::tribble(
  ~lon, ~lat, ~item_prod_code, ~harvest_fraction,
  0.25, 50.25, 15L, 0.6,
  0.75, 50.25, 15L, 0.4
)
gridded_cropland <- tibble::tribble(
  ~lon, ~lat, ~year, ~cropland_ha,
  0.25, 50.25, 2000L, 800,
  0.75, 50.25, 2000L, 500
)
country_grid <- tibble::tribble(
  ~lon, ~lat, ~area_code,
  0.25, 50.25, 1L,
  0.75, 50.25, 1L
)
build_gridded_landuse(
  country_areas, crop_patterns, gridded_cropland, country_grid,
  config = list(years = 2000L)
)

Build gridded livestock dataset

Description

Disaggregate country-level FAOSTAT livestock stocks and emissions to a 0.5-degree grid. Each species group uses a tailored spatial proxy:

Ruminants (cattle, buffalo, sheep/goats, equines): LUH2 managed pasture (pastr) plus rangeland (range), optionally weighted by a static manure-intensity reference (West et al. 2014).
Confined animals (pigs, poultry): LUH2 aggregate cropland, reflecting intensive farming co-location with crop production.
Range specialists (camels): LUH2 rangeland only.
Mixed (other animals): 50/50 blend of pasture and cropland.

For each country, year, and species group the function distributes the national total proportionally to cell-level proxy weights:

$\text{cell}_i = \frac{w_i}{\sum_{j \in \text{country}} w_j} \times T_{\text{country}}$

where $\text{cell}_i$ is the value allocated to cell $i$, $w_i$ is the proxy weight in cell $i$ (land-use hectares times optional reference-pattern intensity) and $T_{\text{country}}$ is the country total (heads or emissions).
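
The proportional rule above can be written in a few lines of base R. The cell table, weights and country totals are invented for the example, and the column names are not part of the package's input contract.

```r
cells <- data.frame(
  area_code = c(1L, 1L, 1L, 2L, 2L),
  pasture_ha = c(100, 300, 0, 50, 50),
  intensity = c(1, 1, 1, 2, 1)     # optional reference-pattern intensity
)
country_heads <- c("1" = 8000, "2" = 900)

# Proxy weight: land-use hectares times reference intensity.
cells$w <- cells$pasture_ha * cells$intensity

# Sum of weights within each country, repeated on every cell row.
country_w <- ave(cells$w, cells$area_code, FUN = sum)

# Each cell receives its weight share of the national total.
totals <- unname(country_heads[as.character(cells$area_code)])
cells$heads <- cells$w / country_w * totals
```

Cells with zero weight receive nothing, and the cell values of each country add up to its national total.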

Methodology

Livestock spatialization is not covered by LandInG (Ostberg et al. 2023), which focuses on crops only. The approach here extends the LandInG framework by using the same LUH2-based spatial proxies (pasture, rangeland, cropland) for livestock distribution.

Country-level data comes from build_primary_production() (stocks) and the faostat-emissions-livestock pin (CH4/N2O emissions), with predecessor redistribution and pre-1961 backfill already applied.

The Zenodo livestock density input (Heinke 2025, doi:10.5281/zenodo.14946695) provides an alternative calibrated LSU/ha reference for use with the glw_density parameter.

Data sources and references

Source	Use
FAOSTAT Production_Livestock (FAO 2024)	Country-level heads
FAOSTAT Emissions_livestock (FAO 2024)	Enteric CH4, manure CH4/N2O
LUH2 v2h (Hurtt et al. 2020)	Time-varying pasture + cropland
West et al. (2014)	Static manure-N intensity reference
GLW3 (Gilbert et al. 2018)	Species-specific density (optional)
Heinke (2025)	Calibrated LSU/ha density (optional)
IPCC 2006/2019	N-excretion rates, emission factors

Usage

build_gridded_livestock(
  livestock_data,
  gridded_pasture,
  gridded_cropland,
  country_grid,
  species_proxy = NULL,
  manure_pattern = NULL,
  glw_density = NULL,
  grass_productivity = NULL,
  years = NULL
)

Arguments

livestock_data

A tibble with country-level livestock data. Required columns:

year: Integer year.
area_code: Country code (WHEP polities).
species_group: Livestock functional-type name (e.g. "cattle", "pigs", "poultry").
heads: Live animal count (number of head).

Any additional numeric columns (e.g. enteric_ch4_kt, manure_ch4_kt, manure_n2o_kt, manure_n_mg) are distributed to the grid using the same proportional weights as heads.

gridded_pasture

A tibble with annual gridded pasture extent. Required columns:

lon, lat: Cell centre coordinates (0.5 degree).
year: Integer year.
pasture_ha: Managed pasture area in hectares (LUH2 pastr).
rangeland_ha: Rangeland area in hectares (LUH2 range).

gridded_cropland

A tibble with annual gridded cropland extent. Required columns:

lon, lat: Cell centre coordinates.
year: Integer year.
cropland_ha: Total cropland area in hectares.

country_grid

A tibble mapping grid cells to countries. Required columns:

lon, lat: Cell centre coordinates.
area_code: Country code.

Optional columns:

cell_area_frac (or area_frac): Fraction of the physical cell belonging to this polity compartment. Defaults to 1.
polycell_id, cell_id: Stable compartment/cell identifiers preserved in outputs when present.
year or validity intervals (valid_from/valid_to, start_year/end_year, from_year/to_year) for historical, time-varying polity overlays.

species_proxy

A tibble mapping each species_group to its spatial proxy type: "pasture", "cropland", "rangeland", or "mixed". Required columns:

species_group: Group name (must match livestock_data).
spatial_proxy: One of "pasture", "cropland", "rangeland", or "mixed".

If NULL, a default mapping is used (see Details).

manure_pattern

A tibble with static manure-intensity weights (e.g. from West et al. 2014). Optional. Expected columns:

lon, lat: Cell centre coordinates.
manure_intensity: Relative intensity (kg N per ha or similar).

Values are used multiplicatively with the land-use proxy. If NULL, land-use weights are used alone.

glw_density

A tibble with species-specific gridded livestock density from GLW3 (Gilbert et al. 2018). Optional. Expected columns:

lon, lat: Cell centre coordinates.
species_group: Must match livestock_data.
density: Heads per cell (reference year ~2010).

If provided, this replaces the LUH2-based proxy for the matching groups, while still being scaled by LUH2 time trends. If NULL, LUH2 proxies are used for all groups.

grass_productivity

A tibble with grass productivity per cell (lon, lat, grass_npp) from read_lpjml_grass_productivity(). Optional. When provided, it multiplies the pasture/rangeland (grazer) proxy weights so animals follow grass production rather than area alone; cropland/mixed proxies are unaffected. If NULL, area proxies are used alone.

years

Integer vector of years to spatialize. If NULL (default), all years present in livestock_data are processed. When supplied, livestock_data, gridded_pasture, and gridded_cropland are filtered to this set before processing.

Value

A tibble with gridded livestock data. Columns:

lon, lat: Cell centre coordinates.
area_code: WHEP polity code for this cell compartment.
polity_area_code, reporting_polity_code, reporting_polity_name, reporting_polity_has_geometry: Polity metadata for area_code.
polycell_id, cell_id: Preserved when supplied in country_grid.
year: Integer year.
species_group: Livestock functional type.
heads: Allocated live animal count.
Any additional numeric columns from livestock_data (e.g. enteric_ch4_kt, manure_ch4_kt).

Examples

# Minimal example with toy data
livestock_data <- tibble::tribble(
  ~year, ~area_code, ~species_group, ~heads,
  2000L,         1L,       "cattle",   5000
)
gridded_pasture <- tibble::tribble(
  ~lon,  ~lat,  ~year, ~pasture_ha, ~rangeland_ha,
   0.25, 50.25, 2000L,         600,           200,
   0.75, 50.25, 2000L,         400,           100
)
gridded_cropland <- tibble::tribble(
  ~lon,  ~lat,  ~year, ~cropland_ha,
   0.25, 50.25, 2000L,          800,
   0.75, 50.25, 2000L,          500
)
country_grid <- tibble::tribble(
  ~lon,  ~lat, ~area_code,
   0.25, 50.25,         1L,
   0.75, 50.25,         1L
)
build_gridded_livestock(
  livestock_data, gridded_pasture, gridded_cropland, country_grid
)

Build a hectare-year (land-occupation) crop land extension.

Description

Per-crop land occupation in hectare-years (the LCA m²·year convention): the land-time each crop's production ties up, $\text{occupation}_i = \text{harvested}_i \times L_i / 12 + \text{fallow}_i$.

The first term is active growing occupation: harvested area times mean cycle length $L_i$ (months, from MIRCA2000). Because it uses harvested area, a double-cropped field contributes both cycles, and a long-cycle perennial contributes close to a full year. The second term is the rotational fallow attributed to the crop, which occupies land for the whole year while it rests.

This is "active" occupation: land is charged only while a crop is growing on it or resting in its rotation, so the national total falls below physical cropland area (which also counts land left idle off-season). It is distinct from, and complementary to, the physical-area extensions (build_cropgrids_land_extension()): those measure the field area each crop holds; this measures how much land-time each crop's activity occupies. Short single-cropped crops and intensively double-cropped staples occupy less land-time per hectare than long-cycle and perennial crops.
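
The occupation formula can be checked by hand with a short base-R sketch. The data frame and its column names are illustrative assumptions, and the crops are toy values rather than package output.

```r
# Hypothetical inputs, one row per crop.
crops <- data.frame(
  crop = c("rice", "wheat"),
  harvested_ha = c(200, 100),  # harvested area counts every cropping cycle
  season_months = c(5, 8),     # mean cycle length L_i in months
  fallow_ha = c(0, 20)         # rotational fallow attributed to the crop
)

# occupation_i = harvested_i * L_i / 12 + fallow_i
crops$occupation_hayr <- crops$harvested_ha * crops$season_months / 12 +
  crops$fallow_ha

print(crops)
```

Here rice occupies 200 × 5/12 ≈ 83.3 hectare-years and wheat 100 × 8/12 + 20 ≈ 86.7 hectare-years, so the smaller wheat area ties up slightly more land-time once its longer cycle and its fallow are counted.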

Usage

build_hayr_land_extension(
  harvested = NULL,
  fallow = NULL,
  season = NULL,
  base = c("cropgrids_fallow", "cropgrids")
)

Arguments

harvested

Tibble of harvested area with columns year, area_code, item_cbs_code, harvested_ha. If NULL, built from get_primary_production() (and reused to build the fallow term).

fallow

Tibble of attributed rotational fallow with columns year, area_code, item_cbs_code, fallow_ha. If NULL: for base = "cropgrids_fallow" it is the difference between the fallow-inclusive and cropped CROPGRIDS physical extensions; for base = "cropgrids" it is zero (growing occupation only).

season

Tibble of mean crop cycle length with columns item_cbs_code, season_months (strictly positive, unique keys). If NULL, the packaged MIRCA2000 season table is used. Crops with no season are given the median cycle length.

base

"cropgrids_fallow" (default, includes rotational fallow) or "cropgrids" (growing occupation only). Recorded in method_land as <base>_hayr.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (land occupation in hectare-years), and method_land.

Examples

harvested <- tibble::tribble(
  ~year, ~area_code, ~item_cbs_code, ~harvested_ha,
  2000L, 1L, 2807L, 200, # rice, double-cropped (two harvests per field)
  2000L, 1L, 2511L, 100 # wheat, single-cropped
)
season <- tibble::tribble(
  ~item_cbs_code, ~season_months,
  2807L, 5,
  2511L, 8
)
fallow <- tibble::tribble(
  ~year, ~area_code, ~item_cbs_code, ~fallow_ha,
  2000L, 1L, 2511L, 20
)
build_hayr_land_extension(harvested, fallow, season)

Build multi-regional input-output model.

Description

Construct a multi-regional input-output (MRIO) model from supply-use tables, bilateral trade, and commodity balance sheets. Uses the industry technology assumption to derive symmetric product-by-product tables.

The resulting matrices follow the FABIO methodology (Bruckner et al., 2019). Rows and columns of Z represent (country, item) pairs. Each entry Z[i,j] gives the intermediate flow from sector i to sector j.

Usage

build_io_model(
  supply_use = NULL,
  bilateral_trade = NULL,
  cbs = NULL,
  years = NULL,
  endogenize_losses = FALSE,
  method = c("mass", "value"),
  prices = NULL
)

Arguments

supply_use

Tibble from build_supply_use(). By default, this function calls build_supply_use() internally. Must have columns: year, area_code, proc_group, proc_cbs_code, item_cbs_code, type, value.

bilateral_trade

Tibble from get_bilateral_trade(). By default, this function calls get_bilateral_trade() internally. Must have columns: year, item_cbs_code, bilateral_trade (list-column of matrices).

cbs

Tibble from get_wide_cbs(). By default, this function calls get_wide_cbs() internally. Must have columns: year, area_code, item_cbs_code, production, import, export, stock_withdrawal, stock_addition, plus final demand columns (food, other_uses).

years

Numeric vector of years to compute, or NULL. If NULL, computes all years in the intersection of available data across inputs. If specified, must be a subset of available years.

endogenize_losses

Logical. If TRUE and cbs contains a losses column, losses are moved from final demand to the diagonal of Z (self-use), following the FABIO convention. The losses column is removed from Y and fd_labels. Defaults to FALSE.
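
A minimal sketch of what moving losses onto the diagonal means, using a hypothetical two-sector system. The function name endogenize and all numbers are assumptions for illustration, not the package's code.

```r
# Hypothetical two-sector system.
Z <- matrix(c(10, 5,
              2, 8), nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("s1", "s2")))
Y <- matrix(c(30, 4,
              20, 1), nrow = 2, byrow = TRUE,
            dimnames = list(c("s1", "s2"), c("food", "losses")))

# Hypothetical helper: add the losses column to the diagonal of Z (self-use)
# and drop it from final demand.
endogenize <- function(Z, Y, col = "losses") {
  if (!col %in% colnames(Y)) {
    return(list(Z = Z, Y = Y))
  }
  diag(Z) <- diag(Z) + Y[, col]
  Y <- Y[, colnames(Y) != col, drop = FALSE]
  list(Z = Z, Y = Y)
}

out <- endogenize(Z, Y)
print(out$Z)
print(out$Y)

# Total use per sector is unchanged by the move.
stopifnot(isTRUE(all.equal(
  rowSums(Z) + rowSums(Y),
  rowSums(out$Z) + rowSums(out$Y)
)))
```

The row totals are preserved, so only the split between intermediate use and final demand changes.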

method

Co-product allocation method. "mass" (default) splits a multi-output process's inputs across its products by physical mass; "value" splits them by economic value (mass times export price), so high-value co-products (e.g. oil over cake, meat over hides) carry a larger share of upstream pressures. A process whose co-products lack usable prices falls back to mass.
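
The difference between the two allocation rules can be sketched in base R for a single hypothetical crushing process. The helper allocation_shares, the prices and the quantities are all illustrative assumptions.

```r
# Hypothetical oilseed crushing process with two co-products.
outputs <- data.frame(
  item = c("oil", "cake"),
  mass_t = c(20, 80),
  price = c(1.0, 0.25)  # price per tonne, arbitrary units
)
input_t <- 105  # tonnes of oilseed used by the process

# Hypothetical helper: shares by value when prices are usable, else by mass.
allocation_shares <- function(mass, price = NULL) {
  usable <- !is.null(price) && all(is.finite(price)) && all(price > 0)
  basis <- if (usable) mass * price else mass
  basis / sum(basis)
}

outputs$share_mass <- allocation_shares(outputs$mass_t)
outputs$share_value <- allocation_shares(outputs$mass_t, outputs$price)
outputs$input_by_mass_t <- input_t * outputs$share_mass
outputs$input_by_value_t <- input_t * outputs$share_value

print(outputs)
```

With these numbers the oil carries 20% of the input under mass allocation but 50% under value allocation, which is the shift towards high-value co-products described above.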

prices

Optional tibble of item prices as from build_cbs_prices() (year, element, item_cbs_code, price). Used only when method = "value"; built automatically when NULL.

Value

A tibble with one row per year and list-columns:

Z: Inter-industry flow matrix (product-by-product).
Y: Final demand matrix.
X: Total output vector.
labels: Tibble mapping row/column indices to area_code, item_cbs_code, and reporting polity metadata.
fd_labels: Tibble mapping each Y column to its area_code (consuming polity), fd_col (demand category, e.g. "food"), and reporting polity metadata. Pass to compute_footprint() as fd_labels to get a target_fd column in the footprint output.
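
To show how matrices of this shape are typically used downstream, the sketch below builds a toy Z, Y and X and derives the technical coefficients and the Leontief inverse. It assumes the standard input-output identity (total output equals intermediate plus final use) and uses made-up numbers; it is not a description of what build_io_model() or compute_footprint() do internally.

```r
# Toy two-sector system (hypothetical values).
Z <- matrix(c(10, 5,
              2, 8), nrow = 2, byrow = TRUE)
Y <- matrix(c(30, 20), ncol = 1)

# Assumed accounting identity: output = intermediate use + final demand.
X <- rowSums(Z) + rowSums(Y)

# Technical coefficients: input from sector i per unit output of sector j.
A <- Z %*% diag(1 / X)

# Leontief inverse: direct plus indirect requirements per unit of final demand.
L <- solve(diag(nrow(A)) - A)

# Sanity check: L reproduces total output from final demand.
stopifnot(isTRUE(all.equal(as.vector(L %*% rowSums(Y)), X)))
print(L)
```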

Examples

su <- build_supply_use(example = TRUE)
btd <- get_bilateral_trade(example = TRUE)
cbs <- get_wide_cbs(example = TRUE)
build_io_model(su, btd, cbs)

Build a consumption land footprint by physical trade balance.

Description

End-to-end land-balance footprint for one year on real WHEP data: assemble production (primary crop production plus grass dry-matter derived from grassland area times yield), the bilateral trade network, and the crop-plus-grassland direct-land extension, then trace land to consumers with compute_footprint_balance(). This is the independent, non-Leontief estimator for stress-testing the multi-regional input-output footprint via compare_footprint_methods().

Grass items (item_cbs_code 3000 and 3002) are barely traded, so their land stays with the producing country: the balance, unlike the input-output model, does not route grass through the grass-to-livestock chain. Any production, trade or extension supplied explicitly is used as-is instead of being built, which is useful for reusing cached inputs or for testing.

Usage

build_land_balance_footprint(
  year,
  production = NULL,
  trade = NULL,
  extension = NULL,
  example = FALSE
)

Arguments

year

Year to build the footprint for.

production

Optional production tibble (area_code, item_cbs_code, value); built when NULL.

trade

Optional trade tibble (from_code, to_code, item_cbs_code, value); built when NULL.

extension

Optional direct-land tibble (area_code, item_cbs_code, value); built when NULL.

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble as returned by compute_footprint_balance().

Examples

build_land_balance_footprint(example = TRUE)

Build the livestock greenhouse-gas emissions extension.

Description

Aggregate per-animal IPCC livestock emissions into a footprint extension keyed by (year, area_code, item_cbs_code), expressed in kilograms of carbon-dioxide equivalent (CO2e). This bridges the cohort-level emissions pipeline (calculate_livestock_emissions()) to the input-output grain used by build_io_model() and compute_footprint(), just as build_grassland_land_extension() does for land.

Live-animal head counts come from get_primary_production(), are bridged to IPCC species with prepare_livestock_emissions(), and the resulting enteric and manure emissions are converted to CO2e and summed back to the live-animal commodity sector (item_cbs_code, e.g. 961 for non-dairy cattle), which is itself a sector in build_io_model().

Two IPCC tiers are available, selected with tier:

1 (default): Tier 1 regional emission factors (IPCC 2019). It needs only species, country and head counts, so it is complete for every country in get_primary_production(). It covers enteric and manure methane and manure N2O (direct and indirect, from default per-head nitrogen excretion rates).
2: Tier 2 cohort energy balance (IPCC 2019). It derives enteric CH4 and manure N2O from a per-animal energy and nitrogen balance for finer resolution, but requires cohort weight and diet inputs. Animals whose emissions cannot be resolved (missing diet or energy data) are dropped with a warning rather than entering the footprint as NA. Its per-head enteric and manure emissions sit in the same range as the Tier 1 regional factors. Tier 1 remains the default because it is complete for every country in get_primary_production().

The CO2e conversion uses 100-year global warming potentials selected with gwp:

"ar6" (default): IPCC AR6 (2021) Table 7.15, biogenic CH4 = 27, N2O = 273.
"ar5": IPCC AR5 (2013), CH4 = 28, N2O = 265 (no climate-carbon feedback).
"ar4": IPCC AR4 (2007), CH4 = 25, N2O = 298.

Usage

build_livestock_ghg_extension(
  tier = 1,
  gwp = c("ar6", "ar5", "ar4"),
  data = list(),
  example = FALSE
)

Arguments

tier

IPCC tier, 1 (default) or 2.

gwp

100-year global warming potential standard, "ar6" (default), "ar5" or "ar4".
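
The CO2e conversion is a weighted sum of the two gases. The sketch below applies the 100-year factors listed in the Description; the helper to_co2e and the input masses are hypothetical and only illustrate the arithmetic.

```r
# 100-year global warming potentials per standard, as listed in the Description.
gwp100 <- list(
  ar6 = c(ch4 = 27, n2o = 273),
  ar5 = c(ch4 = 28, n2o = 265),
  ar4 = c(ch4 = 25, n2o = 298)
)

# Hypothetical helper: convert CH4 and N2O masses (kg) to kg CO2e.
to_co2e <- function(ch4_kg, n2o_kg, gwp = c("ar6", "ar5", "ar4")) {
  gwp <- match.arg(gwp)
  f <- gwp100[[gwp]]
  unname(ch4_kg * f[["ch4"]] + n2o_kg * f[["n2o"]])
}

to_co2e(ch4_kg = 100, n2o_kg = 1)         # default AR6 factors
to_co2e(ch4_kg = 100, n2o_kg = 1, "ar4")  # older AR4 factors
```

For 100 kg of CH4 and 1 kg of N2O this gives 2973 kg CO2e under AR6 and 2798 kg CO2e under AR4, so the choice of standard shifts the result by a few percent.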

data

Optional named list of pre-loaded inputs, used to avoid remote reads. The supported element is primary_prod (the get_primary_production() output). When primary_prod is absent, the function falls back to its reader.

example

If TRUE, return a small fixture instead of reading remote data. Defaults to FALSE.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (livestock emissions in kilograms CO2e) and method_ghg (the chosen tier and GWP standard, e.g. "IPCC_2019_Tier1_AR6").

Examples

build_livestock_ghg_extension(example = TRUE)

Build livestock nutrient flows from realised feed intake.

Description

Top-level driver that traces the nitrogen, carbon and volatile solids in realised feed intake (the redistribute_feed() result) through livestock excretion, manure management and application to soil by land use and crop, plus the management-loss side-streams. It chains estimate_n_excretion(), split_manure_management(), apply_management_losses() and allocate_manure_to_land(); at the "subnational" resolution it additionally spills each cell's un-placeable surplus to neighbouring cells with allocate_manure_transport() before local disposal. Every method choice is recorded in a method_* provenance column and the nitrogen balance (excreted = applied + management losses) is conserved.
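
The conservation statement (excreted = applied + management losses) can be expressed as a simple check. The sketch below uses hypothetical totals and a hypothetical helper n_balance_gap; it does not read the actual output structure of this function.

```r
# Hypothetical nitrogen totals (tonnes N) for one polity and year.
excreted_n_t <- 100
applied_n_t <- c(cropland = 55, grassland = 25)
loss_n_t <- c(housing = 8, storage = 12)

# Hypothetical helper: nitrogen left unaccounted for (should be ~0).
n_balance_gap <- function(excreted, applied, losses) {
  excreted - (sum(applied) + sum(losses))
}

gap <- n_balance_gap(excreted_n_t, applied_n_t, loss_n_t)
stopifnot(abs(gap) < 1e-8)
print(gap)
```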

Usage

build_livestock_nutrient_flows(
  intake,
  resolution = "national",
  methods = list(),
  gridded = NULL
)

Arguments

intake

A tibble of realised feed intake (the redistribute_feed() result); see estimate_n_excretion() for the required columns. At "subnational" resolution sub_territory is the "lon_lat" cell id.

resolution

One of "global", "national" (default) or "subnational". Transport between cells runs only at "subnational".

methods

A named list of per-stage option lists, any of excretion, split, losses, allocation and transport, each forwarded to the matching pipeline function's options.

gridded

The land-surface layer (crops and optional grass tibbles) passed to allocate_manure_to_land(); required for the default "potential_uptake" cap. NULL is treated as an empty list.

Value

A named list with applied (manure applied per land_use x crop (x cell) with all method_* provenance columns), losses (management-loss side-streams per polity) and excretion (the per-category excretion totals).

Examples

intake <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~livestock_category,
  ~item_cbs_code, ~feed_quality, ~intake_dm_t,
  2020L, "ESP", NA, "Cattle_milk", 2513L, "high_quality", 200,
  2020L, "ESP", NA, "Cattle_milk", NA, "grass", 600
)
gridded <- list(
  crops = tibble::tribble(
    ~year, ~territory, ~sub_territory, ~crop, ~manure_n_receptivity, ~crop_n_cap,
    2020L, "ESP", NA, "barley", 6, 200,
    2020L, "ESP", NA, "wheat", 4, 200
  )
)
build_livestock_nutrient_flows(intake, gridded = gridded)

Build primary item prices

Description

Compute prices for primary production items. Export trade prices are preferred; when unavailable, production value prices (gross production value divided by quantity) are used as fallback. Gaps are filled via linear interpolation.
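
The two-step logic (prefer the export price, fall back to the production-value price, then interpolate remaining gaps) can be sketched with base R. The column names and prices are illustrative assumptions, not the package's internal table layout.

```r
# Hypothetical price series for one item.
prices <- data.frame(
  year = 2000:2005,
  export_price = c(0.20, NA, NA, 0.26, NA, 0.30),
  production_price = c(0.18, 0.21, NA, NA, NA, 0.29)
)

# Step 1: prefer the export price, fall back to the production-value price.
prices$price <- ifelse(
  is.na(prices$export_price), prices$production_price, prices$export_price
)

# Step 2: fill the remaining gaps by linear interpolation between known years.
known <- !is.na(prices$price)
prices$price <- approx(
  x = prices$year[known], y = prices$price[known], xout = prices$year
)$y

print(prices)
```

In this toy series 2001 takes the production-value price, while 2002 and 2004 have neither source and are interpolated from their neighbours.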

Usage

build_primary_prices(
  primary_prod,
  value_of_production = NULL,
  trade_prices = NULL,
  example = FALSE
)

Arguments

primary_prod

A tibble of primary production, as returned by build_primary_production() or get_primary_production().

value_of_production

A data frame with FAOSTAT Value of Production data. Must contain columns Item.Code (or item_code_prod), Element, Unit, Year, Value, Area.Code (or area_code). If NULL, only trade prices are used.

trade_prices

A tibble as returned by build_trade_prices(). If NULL, it is computed internally.

example

Logical. If TRUE, return a small example tibble. Default FALSE.

Value

A tibble with columns:

year: Integer year.
item_prod_code: Production item code (character).
price: Price in thousand US dollars (KDollars) per tonne.

Examples

build_primary_prices(example = TRUE)

Build primary production dataset

Description

Construct the full primary production dataset from raw FAOSTAT inputs. This is a convenience wrapper that chains the pipeline steps:

.read_production() — read & reformat FAOSTAT data.
.fix_production() — apply Global-ported corrections.
.dedup_production() — keep one value per key across sources.
.qc_production() — flag data-quality anomalies on the surviving (deduplicated) values.

Usage

build_primary_production(
  start_year = 1850,
  end_year = 2023,
  smooth_carry_forward = FALSE,
  example = FALSE,
  show_duplicates = FALSE,
  historical_data = NULL,
  .raw_data = NULL
)

Arguments

start_year

Integer. First year to include. Default 1850.

end_year

Integer. Last year to include. Default 2023.

smooth_carry_forward

Logical. If TRUE, carry-forward tails are replaced with a linear trend. Default FALSE.

example

Logical. If TRUE, return a small hardcoded example tibble instead of reading remote data. Default FALSE.

show_duplicates

Logical. If TRUE, return only the rows that have competing sources in wide format (one column per source) for diagnostic comparison. Default FALSE.

historical_data

Optional harmonized historical production rows to add before the LUH2 historical extension. May be a data frame or a path to a parquet/csv file. Required semantic columns are year, item_prod_code, unit, value, and one of area_code or polity_area_code. Names such as item_prod_name, item_cbs_name, and source are used when present; WHEP item and area tables fill canonical names where possible. Observed historical rows are retained, and LUH2 proxy filling can use them as anchors. Default NULL.

.raw_data

Optional tibble with the same structure as the output of the internal .read_production() step. When supplied, the remote-data read is skipped entirely and the pipeline starts from .fix_production(). Columns required: year, area, area_code, item_prod, item_prod_code, item_cbs, item_cbs_code, live_anim, live_anim_code, unit, value, source. Default NULL.

Value

A tibble with the same columns as get_primary_production(): year, legacy numeric area_code, numeric polity_area_code, reporting_polity_code, reporting_polity_name, reporting_polity_has_geometry, item_prod_code, item_cbs_code, live_anim_code, unit, value, and source. Item names can be recovered via add_item_prod_name() and related helpers. When show_duplicates = TRUE, returns a wide tibble with one column per source showing the competing values.

Examples

build_primary_production(example = TRUE)

Build processing coefficients

Description

Extract the final calibrated processing coefficients from the CBS building pipeline. These can be used independently for footprint calculations.

Usage

build_processing_coefs(
  cbs,
  start_year = 1850,
  end_year = 2023,
  example = FALSE
)

Arguments

cbs

A tibble of final CBS in wide format, as returned by build_commodity_balances().

start_year

Integer. First year to include. Default 1850.

end_year

Integer. Last year to include. Default 2023.

example

Logical. If TRUE, return a small hardcoded dataset for illustration without downloading data. Default FALSE.

Value

A tibble with columns: year, area_code, item_cbs_code_to_process, value_to_process, item_cbs_code_processed, initial_conversion_factor, initial_value_processed, conversion_factor_scaling, final_conversion_factor, final_value_processed.

Examples

build_processing_coefs(example = TRUE)

Build residue feed availability for feed allocation.

Description

Turns the feed destiny of crop residues into the feed_avail contract consumed by redistribute_feed(): maps each crop to its residue commodity item, applies a feed-availability loss, and aggregates to year / territory / residue item.
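
A rough sketch of the loss and aggregation steps, assuming the loss is applied as a simple multiplicative factor (1 - loss_fraction). The residue_item label and the quantities are hypothetical; the real crop-to-residue mapping is handled by the package.

```r
# Hypothetical residue feed rows, already mapped to a residue item.
residues <- data.frame(
  year = c(2000, 2000),
  sub_territory = c("ESP", "ESP"),
  residue_item = c("straw", "straw"),  # hypothetical residue item label
  residue_feed_dm_t = c(50, 30)
)
loss_fraction <- 0.15

# Assumed form of the loss step: keep the share that is not lost.
residues$avail_dm_t <- residues$residue_feed_dm_t * (1 - loss_fraction)

# Aggregate to year / territory / residue item.
feed_avail <- aggregate(
  avail_dm_t ~ year + sub_territory + residue_item,
  data = residues, FUN = sum
)

print(feed_avail)
```

Under this assumption, 80 t of residue dry matter destined for feed leaves 68 t available for intake.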

Usage

build_residue_feed_avail(x, loss_fraction = 0.15, feed_scale = "national")

Arguments

x

A tibble with item_prod_code, year, sub_territory and residue_feed_dm_t (from calculate_residue_destinies()).

loss_fraction

Fraction of the feed residue lost before intake (default 0.15).

feed_scale

Value for the feed_scale column (default "national").

Value

A tibble with the redistribute_feed() feed_avail columns: year, sub_territory, item_cbs_code, feed_group, feed_quality ("residues"), avail_dm_t and feed_scale.

Examples

tibble::tibble(
  item_prod_code = "15", year = 2000, sub_territory = "ESP",
  residue_feed_dm_t = 50
) |>
  build_residue_feed_avail()

Supply and use tables

Description

Create a table with processes, their inputs (use) and their outputs (supply).

Usage

build_supply_use(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the supply and use data for processes. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: The code of the country where the data is from. For code details see e.g. add_area_name().
proc_group: The type of process taking place. It can be one of:
- crop_production: Production of crops and their residues, e.g. rice production, coconut production, etc.
- husbandry: Animal husbandry, e.g. dairy cattle husbandry, non-dairy cattle husbandry, layers chickens farming, etc.
- animal_draught: Annual useful work energy generated by draught animals.
- slaughtering: Slaughter of live animals into meat, edible offals, raw animal fats, and hides/skins.
- processing: Derived subproducts obtained from processing other items. The items used as inputs are those that have a non-zero processing use in the commodity balance sheet. See get_wide_cbs() for more details. Each process has a single input. For some processes, such as olive oil extraction or soyabean oil extraction, this is realistic. Others, such as alcohol production, can be made from several different inputs (e.g. multiple crops work as feedstock), so this data contains no process like alcohol production but rather a virtual process like 'Wheat and products processing', which gives all its possible outputs. This is a constraint imposed by how the data was obtained and might be improved in the future. See get_processing_coefs() for more details.
proc_cbs_code: The code of the main item in the process taking place. Together with proc_group, these two columns uniquely represent a process. The main item is predictable depending on the value of proc_group:
- crop_production: The code is from the item for which seed usage (if any) is reported in the commodity balance sheet, except for field co-products where the process uses the field aggregate item. For example, rice production uses the rice code, while seed cotton production uses the seed cotton code and supplies both cottonseed and cotton lint.
- husbandry: The code of the farmed animal, e.g. bees for beekeeping, non-dairy cattle for non-dairy cattle husbandry, etc.
- slaughtering: The code of the live animal being slaughtered, e.g. non-dairy cattle for cattle slaughter.
- processing: The code of the item that is used as input, i.e., the one that is processed to get other derived products. This uniquely defines a process within the group because of the nature of the data that was used, which you can see in get_processing_coefs().
For code details see e.g. add_item_cbs_name().
item_cbs_code: The code of the item produced or used in the process. Note that this might be the same value as proc_cbs_code, e.g., in rice production process for the row defining the amount of rice produced or the amount of rice seed as input, but it might also have a different value, e.g. for the row defining the amount of straw residue from rice production. For code details see e.g. add_item_cbs_name().
type: Can have two values:
- use: The given item is an input of the process.
- supply: The given item is an output of the process.
value: Quantity in the item's own unit. Most items are measured in tonnes; live animals are measured in heads; animal draught is measured as annual useful work energy (TJ).

Examples

build_supply_use(example = TRUE)

Build global trade prices

Description

Compute global prices of traded items from FAOSTAT trade data. For each item and element (import/export), the price is the total trade value (thousand US dollars, KDollars) divided by the total quantity (tonnes), both aggregated across all countries.
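
A small base-R sketch of a global price computed as a ratio of totals. The reporter labels and values are invented for illustration, and the table layout is only loosely modelled on the raw_trade columns described below.

```r
# Hypothetical export rows for one item and year, two reporters.
trade <- data.frame(
  year = 2000,
  item_code_trade = 15,
  element = "export",
  area = c("A", "B", "A", "B"),  # hypothetical reporter labels
  unit = c("tonnes", "tonnes", "1000 US$", "1000 US$"),
  value = c(1000, 3000, 250, 550)
)

# Sum each unit across countries.
totals <- aggregate(
  value ~ year + item_code_trade + element + unit,
  data = trade, FUN = sum
)
tonnes <- totals$value[totals$unit == "tonnes"]
kdollars <- totals$value[totals$unit == "1000 US$"]

# Global price: ratio of totals, not the mean of country-level ratios.
price <- kdollars / tonnes
print(price)
```

Here the global price is 800 / 4000 = 0.2 thousand US dollars per tonne, whereas the two reporters individually would give 0.25 and about 0.18.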

Usage

build_trade_prices(raw_trade = NULL, example = FALSE)

Arguments

raw_trade

A data.table or tibble of FAOSTAT bilateral trade data with columns year, item_trade, item_code_trade, unit, element, and value. Must include both quantity ("tonnes") and value ("1000 US$") rows. If NULL (default), the data is read from the "faostat-trade-bilateral" pin.

example

Logical. If TRUE, return a small example tibble. Default FALSE.

Value

A tibble with columns:

year: Integer year.
item_trade: Trade item name.
item_code_trade: Numeric FAOSTAT trade item code.
element: "import" or "export".
kdollars: Total trade value in thousand US dollars.
tonnes: Total trade quantity in tonnes.
price: Price in thousand US dollars (KDollars) per tonne.

Examples

build_trade_prices(example = TRUE)

Estimate total biological nitrogen fixation.

Description

Sums the three BNF components: symbiotic crop legumes, symbiotic weeds/cover crops, and non-symbiotic free-living fixation, by running calculate_crop_bnf(), calculate_weed_bnf() and calculate_nonsymbiotic_bnf(). When a climate_type column is present, the climate-specific parameters from bnf_climate_params override the relevant defaults per climate type.

Usage

calculate_bnf(
  x,
  symbiotic_params = list(),
  nonsymbiotic_params = list(),
  soil_params = list()
)

Arguments

x

A tibble carrying the required columns of all three component functions, optionally with a climate_type column.

symbiotic_params, nonsymbiotic_params, soil_params

Named lists passed to the component functions (see those functions).

Details

The weed component uses the weed_npp_n_t already present in x. In the standard crop-NPP chain this is non-zero only when calculate_crop_npp_components() has been run, or when callers supply weed NPP directly; calculate_npp_carbon_nitrogen() treats missing weed biomass as zero.

Value

The input tibble with all component columns plus fert_type ("BNF") and bnf_t (total BNF).
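
The total can be pictured as a plain sum of the three components. The sketch below uses placeholder values; crop_bnf_t and nonsymbiotic_bnf_t are documented outputs of the component functions, while weed_bnf_t is an assumed column name, and which crop estimate (NPP or Anglade) feeds the total is not stated here.

```r
# Placeholder component values in tonnes of N.
components <- data.frame(
  crop_bnf_t = 3.0,          # documented output of calculate_crop_bnf()
  weed_bnf_t = 0.8,          # assumed column name for the weed component
  nonsymbiotic_bnf_t = 0.2   # documented output of calculate_nonsymbiotic_bnf()
)

# Label the flow and add up the three components.
components$fert_type <- "BNF"
components$bnf_t <- with(
  components,
  crop_bnf_t + weed_bnf_t + nonsymbiotic_bnf_t
)
components
```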

Examples

calculate_bnf(
  tibble::tibble(
    item_prod_code = "176", crop_npp_n_t = 10, product_n_t = 5,
    weed_npp_n_t = 4, land_use = "Cropland", legumes_seeded = 0,
    seeded_cover_crop_share = 0, area_ha = 40
  )
)

Calculate cohort and production system distribution.

Description

Distributes national herd totals across GLEAM-defined cohorts and production systems using gleam_livestock_categories and regional weight data.

Usage

calculate_cohorts_systems(data, system_shares = NULL)

Arguments

data

Dataframe with species, heads, and optionally iso3 or region.

system_shares

Optional dataframe with species_gen, system, system_share columns. If NULL, uses GLEAM defaults and routes dairy/non-dairy commodities to their matching production system. Supplying this overrides both, so the supplied shares are used verbatim.

Value

Dataframe expanded to cohort level with cohort, system, cohort_heads, and cohort_fraction columns.

Examples

tibble::tibble(
  species = "Cattle", heads = 10000,
  iso3 = "DEU"
) |>
  calculate_cohorts_systems()

Estimate symbiotic biological nitrogen fixation by crop legumes.

Description

Estimates symbiotic BNF from leguminous crops via two methods (an NPP-based estimate and the Anglade product-based estimate). Environmental modifiers (nitrogen inhibition, temperature, water) are applied only when their driver columns are present in x; an absent driver leaves that modifier at 1.
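
The "absent driver leaves the modifier at 1" rule can be sketched in base R as follows. The helper name, the Gaussian response and the default values of t_opt and t_sigma are all assumptions chosen for illustration; only the parameter names come from this page, and the package's actual response curve may differ.

```r
# Hypothetical helper: returns a temperature modifier per row, or 1 when
# the temp_c driver column is absent. The Gaussian form and the default
# values of t_opt and t_sigma are assumptions, not the package's values.
temperature_modifier <- function(x, t_opt = 25, t_sigma = 10) {
  if (!"temp_c" %in% names(x)) {
    return(rep(1, nrow(x)))
  }
  exp(-((x$temp_c - t_opt)^2) / (2 * t_sigma^2))
}

no_driver <- data.frame(crop_npp_n_t = c(10, 20))
with_driver <- data.frame(crop_npp_n_t = c(10, 20), temp_c = c(25, 15))

temperature_modifier(no_driver)    # modifier stays at 1 for every row
temperature_modifier(with_driver)  # 1 at the optimum, below 1 away from it
```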

Usage

calculate_crop_bnf(x, symbiotic_params = list())

Arguments

x

A tibble with item_prod_code, crop_npp_n_t and product_n_t. Optional driver columns activate the modifiers: n_synth_kg_ha / n_org_kg_ha (nitrogen inhibition), temp_c (temperature), water_input_mm (or precip_mm + irrig_mm) and pet_mm (water).

symbiotic_params

Named list overriding the symbiotic-BNF parameters k_n_synth, k_n_org, t_opt, t_sigma, ai_threshold.

Value

The input tibble with the environmental factors, ndfa_adj, crop_bnf_t (NPP method), crop_bnf_anglade_t (Anglade method) and the per-product ratios bnf_product_ratio_npp / bnf_product_ratio_anglade.

Examples

calculate_crop_bnf(
  tibble::tibble(item_prod_code = "176", crop_npp_n_t = 10, product_n_t = 5)
)

Estimate total crop net primary production.

Description

Assembles total crop net primary production (product plus residue plus root dry matter) by running calculate_crop_residues() then calculate_crop_roots().

Usage

calculate_crop_npp(
  x,
  residue_method = "ensemble",
  root_method = "ensemble",
  weights = list(w_ipcc = 0.5, w_ref = 0.5)
)

Arguments

x

A tibble with item_prod_code, production_t and area_ha, plus any optional adjustment columns used by the residue and root steps.

residue_method

Residue method passed to calculate_crop_residues().

root_method

Root method passed to calculate_crop_roots().

weights

Named list of ensemble weights; w_ipcc for residues and w_ref for roots (each 0-1, default 0.5).

Value

The input tibble with product_dm_t, yield_dm_t_ha, residue_dm_t, root_dm_t, crop_npp_dm_t, method_residue and method_root.

Examples

calculate_crop_npp(
  tibble::tibble(item_prod_code = "15", production_t = 100, area_ha = 40)
)

Estimate cropland NPP components including weeds.

Description

Assembles full cropland net primary production: scales weed biomass from potential NPP and the weed_npp_scaling factors, then partitions crop and weed biomass into dry matter, carbon and nitrogen via calculate_npp_carbon_nitrogen().

Usage

calculate_crop_npp_components(
  x,
  .by = NULL,
  potential = list(method = "lpjml")
)

Arguments

x

A tibble with item_prod_code, area_ha, year, product_dm_t, residue_dm_t and root_dm_t (e.g. the output of calculate_crop_npp()). When npp_potential_dm_t_ha is absent it is computed via calculate_potential_npp() (which, for the default lpjml method, needs lon/lat).

.by

Optional character vector of grouping columns used to fill missing weed-scaling factors with the group mean.

potential

A named list selecting the potential-NPP source: method (default "lpjml") and lpjml options.

Details

The weed_npp_scaling table is taken from Spain_Hist and is flagged to_be_revised: it is Spain-specific and not validated for WHEP's global scope. A weed_scaling_to_be_revised column records this on the output and a one-time warning is emitted.

Value

The input tibble with weed dry matter, the dry-matter / nitrogen / carbon partition for crop, weeds and total, and weed_scaling_to_be_revised.

Examples

tibble::tibble(
  item_prod_code = "15", production_t = 100, area_ha = 40,
  year = 2000, npp_potential_dm_t_ha = 5
) |>
  calculate_crop_npp() |>
  calculate_crop_npp_components()

Estimate crop above-ground residue biomass.

Description

Estimates crop residue dry matter from production and area using an ensemble of an IPCC 2019 linear model and a bio_coefs residue:product ratio. The irrigation and modern-variety adjustments activate only when their driver columns are present in x.

Usage

calculate_crop_residues(
  x,
  method = c("ensemble", "ipcc", "ratio"),
  weights = list(w_ipcc = 0.5)
)

Arguments

x

A tibble with item_prod_code, production_t (fresh matter) and area_ha. Optional water_regime enables the irrigation adjustment; optional year plus region_hanpp enable the modern-variety adjustment.

method

Residue method: "ensemble" (default, weighted IPCC + ratio), "ipcc" (IPCC linear only) or "ratio" (bio_coefs ratio only).

weights

Named list; w_ipcc is the IPCC weight in the ensemble (0-1, default 0.5), used only when method = "ensemble".
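
A minimal sketch of the ensemble weighting, assuming the ratio estimate receives the complement of w_ipcc. The two residue estimates and the helper name are hypothetical; the sketch shows only the weighting step, not how either estimate is derived.

```r
# Hypothetical residue estimates (t dry matter) from the two methods.
residue_ipcc_dm_t <- 140
residue_ratio_dm_t <- 120

# w_ipcc weights the IPCC estimate; the ratio estimate is assumed to
# receive the complement, 1 - w_ipcc.
ensemble_residue <- function(ipcc, ratio, w_ipcc = 0.5) {
  stopifnot(w_ipcc >= 0, w_ipcc <= 1)
  w_ipcc * ipcc + (1 - w_ipcc) * ratio
}

ensemble_residue(residue_ipcc_dm_t, residue_ratio_dm_t)              # equal weights
ensemble_residue(residue_ipcc_dm_t, residue_ratio_dm_t, w_ipcc = 1)  # IPCC only
ensemble_residue(residue_ipcc_dm_t, residue_ratio_dm_t, w_ipcc = 0)  # ratio only
```

Under this reading, w_ipcc = 1 and w_ipcc = 0 reproduce the single-method estimates.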

Value

The input tibble with product_dm_t, yield_dm_t_ha, residue_dm_t and method_residue.

Examples

calculate_crop_residues(
  tibble::tibble(item_prod_code = "15", production_t = 100, area_ha = 40)
)

Estimate crop below-ground (root) biomass.

Description

Estimates root dry matter from above-ground biomass using an ensemble of an IPCC root:shoot ratio and a reference below-ground biomass per hectare. The nitrogen-input and irrigation adjustments activate only when their driver columns are present in x.

Usage

calculate_crop_roots(
  x,
  method = c("ensemble", "root_shoot", "reference"),
  weights = list(w_ref = 0.5)
)

Arguments

x

A tibble with item_prod_code, product_dm_t, residue_dm_t and area_ha. Optional n_input_kg_ha enables the nitrogen adjustment; optional water_regime enables the irrigation adjustment.

method

Root method: "ensemble" (default, weighted root:shoot + reference), "root_shoot" (root:shoot ratio only) or "reference" (reference below-ground biomass only).

weights

Named list; w_ref is the reference weight in the ensemble (0-1, default 0.5), used only when method = "ensemble".

Value

The input tibble with root_dm_t and method_root.

Examples

calculate_crop_roots(
  tibble::tibble(
    item_prod_code = "15",
    product_dm_t = 87.9,
    residue_dm_t = 135.75,
    area_ha = 40
  )
)

Calculate enteric methane emissions.

Description

Wrapper that selects Tier 1 or 2 for enteric CH4 based on data availability.

Usage

calculate_enteric_ch4(data, tier = NULL)

Arguments

data

Dataframe with species, heads. For Tier 2, also needs cohort, weight, and diet_quality. For Tier 1, iso3 is used to select regional emission factors.

tier

Integer 1 or 2. If NULL (default), auto-selects based on data completeness.

Value

Dataframe with all input columns preserved, plus:

method_enteric: tracking label ("IPCC_2019_Tier1" or "IPCC_2019_Tier2").
Tier 1: enteric_ef_kgch4 (emission factor), enteric_ch4_tier1 (total kg CH4).
Tier 2: gross_energy, ym_factor, enteric_ch4_per_head (kg CH4/head/yr), enteric_ch4_tier2 (total kg CH4).
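
The sketch below illustrates one plausible reading of the automatic tier choice and the Tier 1 total. The select_tier() helper is hypothetical and checks only whether the Tier 2 columns exist; the package may also check for missing values. The emission factor is a placeholder, not an IPCC or package value.

```r
# Hypothetical helper: choose Tier 2 only when every Tier 2 column listed
# in the data argument is present; otherwise fall back to Tier 1.
select_tier <- function(data) {
  tier2_cols <- c("cohort", "weight", "diet_quality")
  if (all(tier2_cols %in% names(data))) 2L else 1L
}

herd <- data.frame(species = "Cattle", heads = 1000, iso3 = "DEU")
select_tier(herd)

# Tier 1 total: heads times a per-head emission factor.
# The factor below is a placeholder, not an IPCC or package value.
herd$enteric_ef_kgch4 <- 50
herd$enteric_ch4_tier1 <- herd$heads * herd$enteric_ef_kgch4
herd
```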

Examples

tibble::tibble(
  species = "Cattle", heads = 1000, iso3 = "DEU"
) |>
  calculate_enteric_ch4(tier = 1)

Calculate all livestock emissions.

Description

Main dispatcher that runs the full IPCC 2019 livestock emissions pipeline: energy demand (Tier 2), enteric CH4, manure CH4, and manure N2O.

Selects tier automatically: Tier 2 when cohort-level data (weight, diet) are available; Tier 1 otherwise.

Usage

calculate_livestock_emissions(data, tier = NULL)

Arguments

data

Dataframe with at minimum species and heads. For Tier 2, also needs cohort, weight (or iso3), diet_quality, and production columns.

tier

Integer 1 or 2. If NULL (default), auto-selects based on data completeness.

Value

Dataframe with all emission columns, method tracking, and original data columns preserved.

Examples

tibble::tibble(
  species = "Dairy Cattle",
  cohort = "Adult Female",
  heads = 1000,
  weight = 600,
  diet_quality = "High",
  milk_yield_kg_day = 20
) |>
  calculate_livestock_emissions() |>
  dplyr::select(species, cohort, heads,
    enteric_ch4_tier2, manure_ch4_tier2,
    manure_n2o_total)

Calculate LMDI decomposition.

Description

Performs LMDI (Log Mean Divisia Index) decomposition analysis with flexible identity parsing, automatic factor detection, and support for multiple periods and groupings. Supports sectoral decomposition using bracket notation for both summing and grouping operations.

Usage

calculate_lmdi(
  data,
  identity,
  identity_labels = NULL,
  time_var = year,
  periods = NULL,
  periods_2 = NULL,
  .by = NULL,
  rolling_mean = 1,
  output_format = "clean",
  verbose = TRUE
)

Arguments

data

A data frame containing the variables for decomposition. Must include all variables specified in the identity, time variable, and any grouping variables.

identity

Character. Decomposition identity in format "target:factor1*factor2*...". The target appears before the colon, factors after, separated by asterisks. Supports explicit ratios with / and structural decomposition with [].

identity_labels

Character vector. Custom labels for factors to use in output instead of variable names. The first element labels the target, and subsequent elements label each factor in order. Default: NULL uses variable names as-is.

time_var

Unquoted name of the time variable column in the data. Default: year. Must be numeric or coercible to numeric.

periods

Numeric vector. Years defining analysis periods. Each consecutive pair defines one period. Default: NULL uses all available years.

periods_2

Numeric vector. Additional period specification for complex multi-period analyses. Default: NULL.

.by

Character vector. Grouping variables for performing separate decompositions. Default: NULL (single decomposition for all data).

rolling_mean

Numeric. Window size for rolling mean smoothing applied before decomposition. Default: 1 (no smoothing).

output_format

Character. Format of output data frame. Options: "clean" (default) or "total".

verbose

Logical. If TRUE (default), prints progress messages during decomposition.

Details

The LMDI method decomposes changes in a target variable into contributions from multiple factors using logarithmic mean weights. This implementation supports:

Flexible identity specification:

Automatic factor detection from identity string.
Support for ratio calculations (implicit division).
Sectoral aggregation with [] notation.
Sectoral grouping with {} notation.

Period analysis: The function can decompose changes over single or multiple periods. Periods are defined by consecutive pairs in the periods vector.

Grouping capabilities: Use .by to perform separate decompositions for different groups (e.g., countries, regions) while maintaining consistent factor structure.
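
The role of the logarithmic mean weights can be checked by hand for a two-factor identity. The base R sketch below is not the package's implementation: it applies the standard additive LMDI formula to the data_simple figures from the Examples section, treating each consecutive pair of years as one period. The helper name and the period label format are illustrative choices.

```r
# Logarithmic mean of two positive numbers; equals a when a == b.
log_mean <- function(a, b) {
  ifelse(a == b, a, (a - b) / (log(a) - log(b)))
}

# Same figures as data_simple in the Examples section.
d <- data.frame(
  year = c(2010, 2011, 2012, 2013),
  activity = c(1000, 1100, 1200, 1300),
  intensity = c(0.10, 0.12, 0.09, 0.10),
  emissions = c(100, 132, 108, 130)
)

# Each consecutive pair of years defines one period.
start <- head(seq_len(nrow(d)), -1)
end <- start + 1

# Additive LMDI: weight the log change of each factor by the log mean
# of the target at the end and start of the period.
w <- log_mean(d$emissions[end], d$emissions[start])
result <- data.frame(
  period = paste(d$year[start], d$year[end], sep = "-"),
  activity_effect = w * log(d$activity[end] / d$activity[start]),
  intensity_effect = w * log(d$intensity[end] / d$intensity[start]),
  total_change = d$emissions[end] - d$emissions[start]
)

# The two effects add up to the total change in every period.
result
all.equal(result$activity_effect + result$intensity_effect, result$total_change)
```

Because emissions equals activity times intensity in every year, the log changes of the factors sum to the log change of the target, so the weighted effects reproduce the total change with no residual.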

Value

A tibble with LMDI decomposition results containing:

Time variables and grouping variables (if specified).
additive: Additive contributions (sum equals total change in target).
multiplicative: Multiplicative indices (product equals target ratio).
multiplicative_log: Log of multiplicative indices.
Period identifiers and metadata.

Identity Syntax

The identity parameter uses a special syntax to define decomposition:

Basic format: "target:factor1*factor2*factor3"

Simple decomposition (no sectors):

Basic: "emissions:gdp*(emissions/gdp)"
Complete: "emissions:(emissions/gdp)*(gdp/population)*population"

Understanding bracket notation:

Square brackets [] specify variables to sum across categories, enabling structural decomposition. The bracket aggregates values BEFORE calculating ratios.

Single-level structural decomposition:

"emissions:activity*(activity[sector]/activity)*(emissions[sector]/activity[sector])"
Creates 3 factors: Activity level, Sectoral structure, Sectoral intensity.

Multi-level structural decomposition:

Two levels: "emissions:activity*(activity[sector]/activity)*(activity[sector+fuel]/activity[sector])*(emissions[sector+fuel]/activity[sector+fuel])"
Creates 4 factors: Activity level, Sector structure, Fuel structure, Sectoral-fuel intensity.
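
To see how the three factors of the single-level identity fit together arithmetically, the base R sketch below rebuilds emissions from the data_sectors figures in the Examples section. It reads activity[sector] as the sector-level value and the unbracketed total as the sum over sectors; that reading, and the column names structure, intensity and rebuilt, are assumptions, and the sketch does not reproduce the package's parser.

```r
# Same figures as data_sectors in the Examples section.
s <- data.frame(
  year = c(2010, 2010, 2011, 2011),
  sector = c("industry", "transport", "industry", "transport"),
  activity = c(600, 400, 700, 500),
  emissions = c(60, 40, 63, 55)
)

# Total activity per year: the sum across sectors.
s$total_activity <- ave(s$activity, s$year, FUN = sum)

# Structure and intensity factors for each sector and year.
s$structure <- s$activity / s$total_activity
s$intensity <- s$emissions / s$activity

# Multiplying the three factors recovers each sector's emissions, and
# summing over sectors recovers the yearly total.
s$rebuilt <- s$total_activity * s$structure * s$intensity
tapply(s$rebuilt, s$year, sum)
```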

Data Requirements

The input data frame must contain:

All variables mentioned in the identity.
The time variable (default: year, passed unquoted via time_var).
Grouping variables if using .by.
No missing values in key variables for decomposition periods.

Examples

# In these examples, 'activity' is a measure of scale
# (e.g., GDP in million USD) and 'intensity' is the target
# variable per unit activity (e.g., emissions per million USD).
# The units are illustrative; adapt to your context.
# --- Shared sample data ---
data_simple <- tibble::tribble(
  ~year, ~activity, ~intensity, ~emissions,
  2010,  1000,      0.10,       100,
  2011,  1100,      0.12,       132,
  2012,  1200,      0.09,       108,
  2013,  1300,      0.10,       130
)

# --- 1. Year-over-year decomposition (default) ---
# Decompose annual emission changes into activity and intensity effects.
# The additive column sums to the total change in emissions each period.
calculate_lmdi(
  data_simple,
  identity = "emissions:activity*intensity",
  time_var = year,
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 2. Single baseline-to-end period ---
# Pass a two-element periods vector to get a single cumulative period
# instead of year-over-year results.
calculate_lmdi(
  data_simple,
  identity = "emissions:activity*intensity",
  time_var = year,
  periods = c(2010, 2013),
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 3. Year-over-year AND one cumulative summary period ---
# Use periods_2 to append an extra comparison period alongside the
# year-over-year results.
calculate_lmdi(
  data_simple,
  identity = "emissions:activity*intensity",
  time_var = year,
  periods = c(2010, 2011, 2012, 2013),
  periods_2 = c(2010, 2013),
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 4. Per-country decomposition with .by ---
# Separate LMDI runs per country; results are stacked with a country column.
data_countries <- tibble::tribble(
  ~year, ~country, ~activity, ~intensity, ~emissions,
  2010, "ESP", 1000, 0.10, 100,
  2011, "ESP", 1100, 0.11, 121,
  2012, "ESP", 1200, 0.10, 120,
  2010, "FRA", 2000, 0.05, 100,
  2011, "FRA", 2200, 0.05, 110,
  2012, "FRA", 2400, 0.05, 120
)

calculate_lmdi(
  data_countries,
  identity = "emissions:activity*intensity",
  time_var = year,
  .by = "country",
  verbose = FALSE
) |>
  dplyr::select(
    country,
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 5. Ratio notation ---
# Express factors as explicit ratios (e.g. intensity = emissions/activity).
# Factor labels in the output preserve the ratio form for clarity.
calculate_lmdi(
  data_simple,
  identity = "emissions:(emissions/activity)*activity",
  time_var = year,
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 6. Structural (sectoral) decomposition with [] notation ---
# Decomposes emissions into:
#   total_activity * sector_structure * sector_intensity
# [] sums the bracketed variable across sector before forming ratios,
# enabling proper structural decomposition.
data_sectors <- tibble::tribble(
  ~year, ~sector, ~activity, ~emissions,
  2010, "industry", 600, 60,
  2010, "transport", 400, 40,
  2011, "industry", 700, 63,
  2011, "transport", 500, 55
) |>
  dplyr::group_by(year) |>
  dplyr::mutate(total_activity = sum(activity)) |>
  dplyr::ungroup()

calculate_lmdi(
  data_sectors,
  identity = paste0(
    "emissions:",
    "total_activity*",
    "(activity[sector]/total_activity)*",
    "(emissions[sector]/activity[sector])"
  ),
  time_var = year,
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 7. Custom factor labels ---
# Replace raw variable names with readable labels for reporting.
# Supply one label per term (target first, then each factor in order).
calculate_lmdi(
  data_simple,
  identity = "emissions:activity*intensity",
  identity_labels = c(
    "Total Emissions",
    "Activity Effect",
    "Intensity Effect"
  ),
  time_var = year,
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

# --- 8. Rolling mean smoothing before decomposition ---
# A 3-year rolling mean reduces noise in volatile series before
# computing LMDI weights. Edge years use partial windows (fewer
# than k observations) so no periods are lost.
data_smooth <- tibble::tibble(
  year      = 2010:2020,
  activity  = seq(1000, 2000, length.out = 11),
  intensity = rep(0.1, 11),
  emissions = seq(1000, 2000, length.out = 11) * 0.1
)

calculate_lmdi(
  data_smooth,
  identity = "emissions:activity*intensity",
  time_var = year,
  rolling_mean = 3,
  verbose = FALSE
) |>
  dplyr::select(
    period,
    component_type,
    factor_label,
    additive,
    multiplicative
  )

Calculate manure emissions (CH4 + N2O).

Description

Wrapper that selects Tier 1 or Tier 2 for manure CH4 and also computes manure N2O. As listed under Value, the N2O columns are returned for both tiers.

Usage

calculate_manure_emissions(data, tier = NULL)

Arguments

data

Dataframe with species, heads. For Tier 2, also needs cohort, weight, and diet_quality. For Tier 1, iso3 is used to select regional emission factors.

tier

Integer 1 or 2. If NULL (default), auto-selects based on data completeness.

Value

Dataframe with all input columns preserved, plus:

method_manure_ch4: tracking label.
Tier 1: manure_ef_kgch4, manure_ch4_tier1.
Tier 2: volatile_solids, methane_potential, weighted_mcf, manure_ch4_per_head, manure_ch4_tier2.
N2O (both tiers): method_manure_n2o, n_excretion, manure_n2o_direct, manure_n2o_indirect, manure_n2o_total. Tier 1 uses default per-head excretion rates; Tier 2 uses the energy/nitrogen balance.

Examples

tibble::tibble(
  species = "Cattle", heads = 1000, iso3 = "DEU"
) |>
  calculate_manure_emissions(tier = 1)

Estimate non-symbiotic biological nitrogen fixation.

Description

Estimates free-living and associative BNF in agricultural soils from a base rate (crop-specific from bnf, or a default) scaled by environmental modifiers for nitrogen, temperature, water, soil organic matter, pH and clay. Each modifier activates only when its driver column is present.

Usage

calculate_nonsymbiotic_bnf(
  x,
  nonsymbiotic_params = list(),
  soil_params = list()
)

Arguments

x

A tibble with area_ha. Optional nonsymbiotic_base_kg_ha (or item_prod_code to join the crop-specific base rate), plus the nitrogen, temperature, water and soil (som_pct, soil_ph, clay_pct) drivers.

nonsymbiotic_params

Named list overriding nsbnf_default_kg_ha, k_n_synth, k_n_org, t_opt, t_sigma, ai_threshold.

soil_params

Named list overriding k_som, som_ref, ph_opt, ph_sigma, k_clay, clay_ref.

Value

The input tibble with nonsymbiotic_base_kg_ha, the six f_*_nonsymbiotic modifiers, f_env_nonsymbiotic and nonsymbiotic_bnf_t.

Examples

calculate_nonsymbiotic_bnf(tibble::tibble(area_ha = 40))

Partition crop and weed NPP into dry matter, carbon and nitrogen.

Description

Converts crop NPP components (product, residue, root) and weed biomass to nitrogen and carbon using the bio_coefs per-component coefficients and the weed_coefs scalars. Root and weed below-ground nitrogen include rhizodeposits. Root carbon uses root_c_kgdm, which includes root biomass carbon plus rhizodeposit carbon per tonne of root dry matter; use root_mass_c_kgdm in bio_coefs for root tissue carbon alone. The residue-to-soil split is computed only when a residue_soil_dm_t column (from calculate_residue_destinies()) is present.

Usage

calculate_npp_carbon_nitrogen(x)

Arguments

x

A tibble with item_prod_code, product_dm_t, residue_dm_t and root_dm_t. Optional weed_ag_dm_t (above-ground weed dry matter; treated as 0 when absent), crop_npp_dm_t (kept when present) and residue_soil_dm_t (enables the soil-residue nitrogen and carbon split).

Value

The input tibble with weed dry matter, and nitrogen (*_n_t) and carbon (*_c_t) for product, residue, root, weeds, crop NPP and total NPP.

Examples

tibble::tibble(item_prod_code = "15", production_t = 100, area_ha = 40) |>
  calculate_crop_npp() |>
  calculate_npp_carbon_nitrogen()

N soil inputs and Nitrogen Use Efficiency (NUE) for crops

Description

N inputs (deposition, fixation, synthetic fertilizers, urban sources, manure) and N production in Spain from 1860 to the present for the GRAFS model at the provincial level. The crop NUE is defined as the percentage of produced nitrogen relative to the total nitrogen inputs to the soil. Total soil inputs are calculated as: inputs = deposition + fixation + synthetic + manure + urban
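
The two definitions above can be worked through with placeholder numbers. In this base R sketch the N flows are invented and prod_n is an assumed name for the nitrogen in production; only the formulas follow the description.

```r
# Hypothetical N flows for one province, item and year.
n_flows <- data.frame(
  deposition = 10,
  fixation = 5,
  synthetic = 60,
  manure = 20,
  urban = 5,
  prod_n = 45  # assumed name for nitrogen in production
)

# Total soil inputs as defined above.
n_flows$inputs <- with(
  n_flows,
  deposition + fixation + synthetic + manure + urban
)

# NUE: produced nitrogen as a percentage of total soil inputs.
n_flows$nue <- n_flows$prod_n / n_flows$inputs * 100
n_flows
```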

Usage

calculate_nue_crops(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble containing nitrogen use efficiency (NUE) for crops. It includes the following columns:

year: Year.
province_name: The Spanish province.
item: The item which was produced, defined in names_biomass_cb.
box: One of the two systems of the GRAFS model: cropland or semi-natural agroecosystems.
nue: Nitrogen Use Efficiency as a percentage (%).

Examples

calculate_nue_crops(example = TRUE)

NUE for Livestock

Description

Calculates Nitrogen Use Efficiency (NUE) for livestock categories (excluding pets).

The livestock NUE is defined as the percentage of nitrogen in livestock products relative to the nitrogen in feed intake: nue = prod_n / feed_n * 100

Additionally, a mass balance is calculated to check the recovery of N in products and excretion relative to feed intake: mass_balance = (prod_n + excretion_n) / feed_n
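
Both formulas can be applied directly to a small table. The figures and category labels in this base R sketch are placeholders; the column names match those listed under Value.

```r
# Placeholder nitrogen flows (Mg) for two livestock categories.
livestock <- data.frame(
  livestock_cat = c("cat_a", "cat_b"),  # placeholder categories
  prod_n = c(20, 5),
  feed_n = c(100, 50),
  excretion_n = c(75, 45)
)

# NUE: nitrogen in products as a percentage of nitrogen in feed intake.
livestock$nue <- livestock$prod_n / livestock$feed_n * 100

# Mass balance: share of feed nitrogen recovered in products plus excretion.
livestock$mass_balance <-
  (livestock$prod_n + livestock$excretion_n) / livestock$feed_n
livestock
```

A mass balance close to 1 means that products and excretion together account for nearly all nitrogen in feed intake.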

Usage

calculate_nue_livestock(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble containing:

year: Year
province_name: Spanish province
livestock_cat: Livestock category
item: Produced item
prod_n: Nitrogen in livestock products (Mg)
feed_n: Nitrogen in feed intake (Mg)
excretion_n: Nitrogen excreted (Mg)
nue: Nitrogen Use Efficiency (%)
mass_balance: Mass balance ratio, (prod_n + excretion_n) / feed_n

Examples

calculate_nue_livestock(example = TRUE)

Estimate potential net primary production.

Description

Estimates potential net primary production per unit area, used downstream to scale weed biomass on cropland. Several methods are available; the default lpjml reads gross managed-grassland NPP from a finished LPJmL run, while the climate methods compute potential NPP from temperature, water input and actual evapotranspiration.

Usage

calculate_potential_npp(
  x,
  method = c("lpjml", "miami", "nceas", "rosenzweig"),
  lpjml = list()
)

Arguments

x

A tibble. The climate methods require temp_c (mean annual temperature, degrees C), water_input_mm (precipitation plus irrigation) and aet_mm (actual evapotranspiration). The lpjml method instead needs a spatial key to join the gridded grass NPP (see lpjml).

method

Potential-NPP method, one of "lpjml" (default), "miami", "nceas" or "rosenzweig".

lpjml

A named list of options for the lpjml method (passed to the LPJmL grass reader). Ignored by the climate methods.

Value

The input tibble with npp_potential_dm_t_ha (potential NPP, t DM per hectare) and method_npp_potential (the method used).

Examples

calculate_potential_npp(
  tibble::tibble(temp_c = 15, water_input_mm = 800, aet_mm = 700),
  method = "miami"
)

Estimate the destinies of crop residues.

Description

Splits crop residue dry matter into three destinies that sum to the total residue: fed to livestock, burned / removed for fuel, and left on the field for soil incorporation.

Usage

calculate_residue_destinies(x, method = c("krausmann_regional", "shares"))

Arguments

x

A tibble with item_prod_code and residue_dm_t. The krausmann_regional method also needs region_krausmann (for the recovery rate) and region_hanpp (for the feed-use fraction). region_krausmann can use the recovery-table labels or the matching regions_full labels. The shares method needs year.

method

Destiny method: "krausmann_regional" (default; the Krausmann recovery rate multiplied by the HANPP-regional feed-use fraction) or "shares" (the Spain-specific per-crop-year use/burn shares, flagged to_be_revised).

Value

The input tibble with residue_feed_dm_t, residue_burn_dm_t, residue_soil_dm_t and method_residue_destiny.

Examples

calculate_residue_destinies(
  tibble::tibble(
    item_prod_code = "15", residue_dm_t = 100,
    region_krausmann = "Western Europe", region_hanpp = "Western Europe"
  )
)

System NUE

Description

Calculates the NUE for Spain at the provincial level. The system NUE is defined as the percentage of total nitrogen production (total_prod) relative to the sum of all nitrogen inputs (inputs) into the soil system.

Usage

calculate_system_nue(n_soil_inputs = create_n_soil_inputs(), example = FALSE)

Arguments

n_soil_inputs

A tibble of nitrogen soil input (deposition, fixation, synthetic, manure, urban). If not provided and example = FALSE, it will be computed from create_n_soil_inputs().

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the following columns:

year: Year
province_name: Spanish province
total_prod: Total nitrogen production (Mg)
inputs: Total nitrogen inputs (Mg)
nue_system: System-level Nitrogen Use Efficiency (%)

Examples

calculate_system_nue(example = TRUE)

Calculate uncertainty bounds for livestock emissions.

Description

Applies IPCC uncertainty ranges to emission estimates. Multipliers are read from the uncertainty_ranges table; no values are hard-coded.

Usage

calculate_uncertainty_bounds(data)

Arguments

data

Data frame with emission columns from calculate_livestock_emissions().

Value

Data frame with added _lower and _upper columns for each emission estimate.

Examples

tibble::tibble(
  species = "Dairy Cattle",
  cohort = "Adult Female",
  heads = 1000,
  weight = 600,
  diet_quality = "High",
  milk_yield_kg_day = 20
) |>
  calculate_livestock_emissions() |>
  calculate_uncertainty_bounds() |>
  dplyr::select(species, cohort, heads,
    enteric_ch4_tier2, enteric_ch4_tier2_lower,
    enteric_ch4_tier2_upper, manure_ch4_tier2,
    manure_ch4_tier2_lower, manure_ch4_tier2_upper)

Estimate symbiotic biological nitrogen fixation by weeds and cover crops.

Description

Estimates symbiotic BNF from leguminous weeds and seeded cover crops. The legume fraction is a weighted average of spontaneous weeds and seeded cover crops. Environmental modifiers activate only when their driver columns are present.

Usage

calculate_weed_bnf(x, symbiotic_params = list())

Arguments

x

A tibble with weed_npp_n_t, land_use, legumes_seeded and seeded_cover_crop_share. Optional legumes_spontaneous overrides the land-use default; the same environmental driver columns as calculate_crop_bnf() are supported.

symbiotic_params

Named list overriding the symbiotic-BNF parameters (see calculate_crop_bnf()).

Value

The input tibble with weed_ndfa_ref, weed_ndfa, weed_leg_share, f_env_weed and weed_bnf_t.

Examples

calculate_weed_bnf(
  tibble::tibble(
    weed_npp_n_t = 10, land_use = "Cropland",
    legumes_seeded = 0, seeded_cover_crop_share = 0
  )
)

Commodity balance sheet processing fractions

Description

Specifies the product fractions obtained when CBS items are processed, linking processed items to their output CBS categories.

Usage

cb_processing

Format

A tibble where each row corresponds to one processed-item / output-category combination. It contains the following columns:

ProcessedItem: Name of the CBS item being processed (e.g., "Apples and products", "Barley and products").
item_cbs: Name of the output CBS category produced by processing (e.g., "Alcohol, Non-Food").
Product_fraction: Conversion factor from processed input quantity to output product quantity. This can exceed 1 when the output includes added mass, such as water in beverages.
Value_fraction: Economic value fraction associated with the output product (numeric; largely NA in current data).
Required: Marks required co-product links in selected processing chains.

Source

Derived from FAOSTAT commodity balance sheet processing assumptions.

Examples

head(cb_processing)

CBS to trade item code mapping

Description

Maps detailed FAOSTAT trade item codes to their corresponding CBS item categories, enabling aggregation of bilateral trade data into the CBS framework.

Usage

cbs_trade_codes

Format

A tibble where each row corresponds to one trade item. It contains the following columns:

item_code_trade: Numeric FAOSTAT trade item code (e.g., 15 for wheat).
item_trade: Name of the trade item (e.g., "Wheat", "Flour, wheat", "Bran, wheat").
item_cbs: Name of the CBS category this trade item belongs to (e.g., "Wheat and products").
item_check: Cross-validation column repeating the mapped CBS name; used to flag mapping inconsistencies during data processing.

Source

Derived from FAOSTAT Detailed Trade Matrix and commodity balance sheet correspondence tables.

Examples

head(cbs_trade_codes)

FAOSTAT crop to LPJmL crop functional type (CFT) mapping

Description

Maps FAOSTAT primary-production item codes to WHEP's granular 33-class crop functional type taxonomy and the coarser LPJmL-compatible parent class. Used by build_gridded_landuse() and run_spatialize() to aggregate spatialized crop-level output into named crop functional types.

Usage

cft_mapping

Format

A tibble with one row per mapped FAOSTAT item. Columns:

item_prod_code: Integer FAOSTAT item code.
item_prod_name: Human-readable FAOSTAT item name.
cft_name: Granular WHEP CFT name (33 classes, e.g. "temperate_cereals", "coffee", "oil_crops_oilpalm").
cft_lpjml: LPJmL-compatible parent class; one of the 12 LPJmL v6 named crop CFTs or "others".
luh2_type: LUH2 crop functional type (c3ann, c4ann, c3per, or c3nfx).

Source

Adapted from LandInG's crop_types_FAOSTAT_LPJmL_default.csv (Ostberg et al. 2023) with WHEP granular extensions.

Examples

head(cft_mapping)

Check footprint conservation against direct extensions.

Description

Verify the master input-output accounting identity for a footprint: the environmental pressure embodied across all final demand and traced back to an origin sector should equal the direct extension (the source pressure) of that sector.

The footprint engine zeroes negative coefficients, caps column sums, and drops near-zero-output sectors (the FABIO conventions), so the identity holds only approximately. This check quantifies the discrepancy per origin sector instead of asserting exact equality. Crucially it detects under-tracing (pressure that silently disappears and never reaches final demand), which compute_footprint()'s conserve_extensions bounding never reports because it only rescales results downward.

Only the positive side of the extensions is traced, matching the engine, which traces pmax(extensions, 0) and ignores sectors with output <= output_tol.

Usage

check_footprint_conservation(
  footprint,
  extensions,
  labels,
  x_vec,
  output_tol = 1e-08,
  tol = 0.01
)

Arguments

footprint

Footprint tibble from compute_footprint(), with origin_area, origin_item and value columns.

extensions

Numeric vector of environmental extensions per sector, as passed to compute_footprint().

labels

Tibble with area_code and item_cbs_code mapping each sector to its meaning, as passed to compute_footprint().

x_vec

Numeric vector of total output per sector.

output_tol

Minimum output for a sector to be traceable. Sectors with x_vec <= output_tol contribute zero direct pressure, matching compute_footprint().

tol

Relative tolerance for the conservation status. Discrepancies within tol of the direct pressure are "ok".

Value

A tibble with one row per origin sector:

origin_area: Country where the pressure occurs.
origin_item: Item causing the pressure.
direct: Direct (source) extension for the sector.
embodied: Footprint traced back to the sector.
discrepancy: embodied - direct.
rel_discrepancy: discrepancy / direct (NA when direct is zero).
status: One of "ok", "under_traced", "dropped" (embodied is zero while direct is positive) or "over_traced".

Rows are ordered by descending absolute relative discrepancy.
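
The columns listed above can be reproduced by hand for a few sectors. The base R sketch below uses hypothetical direct and embodied values; the order in which the status rules are applied is an assumption, not taken from the package source.

```r
# Hypothetical direct extensions and embodied footprints for four sectors.
direct <- c(50, 30, 20, 10)
embodied <- c(50.2, 24, 0, 12)
tol <- 0.01

discrepancy <- embodied - direct
rel_discrepancy <- ifelse(direct == 0, NA_real_, discrepancy / direct)

# Classify each sector; the order of the checks is an assumption.
status <- ifelse(
  direct > 0 & embodied == 0, "dropped",
  ifelse(abs(rel_discrepancy) <= tol, "ok",
    ifelse(rel_discrepancy < 0, "under_traced", "over_traced")
  )
)

result <- data.frame(direct, embodied, discrepancy, rel_discrepancy, status)
# Largest absolute relative discrepancy first.
result[order(-abs(result$rel_discrepancy)), ]
```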

Examples

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
y_mat <- matrix(c(85, 195), ncol = 1)
extensions <- c(50, 30)
labels <- tibble::tibble(
  area_code = c(1L, 1L),
  item_cbs_code = c(1L, 2L)
)
fp <- compute_footprint(
  x_vec = x_vec, y_mat = y_mat, extensions = extensions,
  labels = labels, z_mat = z_mat
)
check_footprint_conservation(fp, extensions, labels, x_vec)

Flag implausible year-on-year jumps in a time series.

Description

Scan each series for consecutive observations whose ratio falls outside a plausible band, the level-2 (within-series) detector of the AFE data and code validation framework. It catches unjustified single-year steps, spikes and the level shifts that appear where two sources are spliced, while a break-year allowlist keeps documented regime changes from firing as false positives.

Usage

check_series_jumps(
  data,
  value_col,
  time_col = year,
  .by = NULL,
  ratio_bounds = c(0.55, 1.6),
  bands = NULL,
  min_value = 0,
  consecutive_only = TRUE,
  allowlist = NULL,
  verbose = TRUE
)

Arguments

data

A data frame with one observation per row.

value_col

The column holding the series values to scan.

time_col

The column holding time values. Default: year.

.by

A character vector of grouping columns identifying each series (optional). When NULL, the whole table is one series.

ratio_bounds

Length-2 numeric c(low, high): the default plausible band for the ratio of consecutive values. A ratio below low or above high is flagged.

bands

Optional data frame of per-group band overrides: the grouping columns (a subset of .by) plus lo and hi. Where a group matches, its lo/hi replace ratio_bounds for that group.

min_value

Minimum value both members of a pair must exceed to be flagged. Keeps genuine near-zero technology onsets from firing. Default: 0.

consecutive_only

Logical. If TRUE (default), only pairs one time step apart are scanned; larger gaps are skipped.

allowlist

Optional data frame of documented break years, matched on the grouping columns plus time_col. Matching flags are returned with allowlisted = TRUE rather than dropped.

verbose

Logical. If TRUE (default), report flag counts with cli.

Details

This is the first landed detector of the reusable check_* library described in the AFE data-validation framework decision (afse-wiki/wiki/decisions/afse-data-validation-framework.md, level 2, within-series). It generalises the energy-hist consecutive-year jump scan (a world-total ratio scan against a break-year allowlist) into a grouped, band-parameterised check usable both as an inline pipeline guard and as a test backend.

Two pieces of the framework's metadata model are exposed as arguments. bands supplies per-variable plausible-jump bands (land area is tight, yield is wide, so a single global band is wrong for a mixed panel), and allowlist supplies the historically justified break years that are reported but not treated as defects. Undocumented jumps stay flagged.

A robust variant (flagging via the median absolute deviation of log ratios, per the Hampel/MAD anchors in the framework) is a documented future extension, not implemented here.
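
The core of the scan is a grouped ratio of each value to its predecessor. The base R sketch below shows that idea on a small two-series panel with the default band of c(0.55, 1.6). It leaves out bands, min_value and allowlist, and scan_one() is a hypothetical helper, not part of the package.

```r
series <- data.frame(
  category = rep(c("area", "yield"), each = 5),
  year = rep(2000:2004, times = 2),
  value = c(100, 101, 102, 180, 181, 3.0, 3.1, 5.2, 3.0, 3.1)
)
ratio_bounds <- c(0.55, 1.6)

# Within one series, divide every value by the previous year's value
# and keep the rows whose ratio falls outside the plausible band.
scan_one <- function(s) {
  s <- s[order(s$year), ]
  s$prev_value <- c(NA, head(s$value, -1))
  s$ratio <- s$value / s$prev_value
  one_step <- c(FALSE, diff(s$year) == 1)  # consecutive years only
  flagged <- one_step & (s$ratio < ratio_bounds[1] | s$ratio > ratio_bounds[2])
  s[flagged, ]
}

jumps <- do.call(rbind, lapply(split(series, series$category), scan_one))
jumps
```

Only the upward steps are flagged here: the drop in yield from 5.2 to 3.0 has a ratio of about 0.58, which is still inside the band.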

Value

A tibble with one row per flagged jump: the grouping columns, the time_col of the later observation, prev_value, value, ratio (value / prev_value) and allowlisted. When nothing is flagged, a zero-row tibble of the same shape and column types.

Examples

series <- tibble::tibble(
  category = rep(c("area", "yield"), each = 5),
  year = rep(2000:2004, times = 2),
  value = c(100, 101, 102, 180, 181, 3.0, 3.1, 5.2, 3.0, 3.1)
)
# area steps 102 -> 180 (1.76x); yield steps 3.1 -> 5.2 (1.68x)
check_series_jumps(series, value, .by = "category")

# Widen the band for yield only, leaving area on the default:
bands <- tibble::tibble(category = "yield", lo = 0.4, hi = 2.5)
check_series_jumps(series, value, .by = "category", bands = bands)

Check the commodity balance sheet supply-use identity.

Description

Verify that total supply equals total use for every row of a wide commodity balance sheet, the fundamental accounting identity behind the input-output model. Supply is production + import + stock_withdrawal; use is export + food + feed + seed + processing + other_uses + stock_addition.
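
The identity can be checked by hand on a small table. The base R sketch below uses hypothetical quantities with the column names from the identity above; the second row is deliberately unbalanced.

```r
# Hypothetical wide balance sheet; the second row is deliberately unbalanced.
cbs <- data.frame(
  production = c(100, 50), import = c(20, 10), stock_withdrawal = c(0, 5),
  export = c(30, 5), food = c(60, 40), feed = c(20, 10), seed = c(5, 2),
  processing = c(3, 3), other_uses = c(2, 1), stock_addition = c(0, 0)
)
tol <- 1e-06

supply <- cbs$production + cbs$import + cbs$stock_withdrawal
use <- cbs$export + cbs$food + cbs$feed + cbs$seed +
  cbs$processing + cbs$other_uses + cbs$stock_addition

abs_diff <- abs(supply - use)
rel_diff <- ifelse(supply == 0, NA_real_, abs_diff / supply)
data.frame(supply, use, abs_diff, rel_diff, balanced = abs_diff <= tol)
```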

Usage

check_supply_use_balance(cbs, tol = 1e-06)

Arguments

cbs

Wide commodity balance sheet, e.g. from get_wide_cbs().

tol

Absolute tolerance (in the data's mass units) for a row to count as balanced.

Value

A tibble with the key columns present in cbs plus:

supply: Total supply.
use: Total use.
abs_diff: abs(supply - use).
rel_diff: abs_diff / supply (NA when supply is zero).
balanced: TRUE when abs_diff <= tol.

Rows are ordered by descending absolute difference.

Examples

get_wide_cbs(example = TRUE) |>
  check_supply_use_balance()

Climate-zone MCF values.

Description

Methane Conversion Factors by MMS type and climate zone (Cool/Temperate/Warm).

Usage

climate_mcf

Format

A tibble with mms_type, climate_zone, mcf_percent.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.17.

Examples

climate_mcf

Combine independent coefficient-of-variation components.

Description

Aggregate several independent sources of relative uncertainty into a single coefficient of variation by adding them in quadrature. This is how per-indicator data-quality scores (for example the reliability, completeness and representativeness columns of a pedigree matrix) are turned into one CoV for propagate_fp_uncertainty(). Supply the per-indicator CoV values yourself from a sourced, citable pedigree table; no uncertainty factors are hard-coded here.
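
Adding in quadrature means taking the square root of the sum of squared components. A minimal base R sketch of that arithmetic follows; quadrature_cov() is a hypothetical stand-in written for illustration, not the package function, and it handles only the case of several equal-length vectors.

```r
# Combine independent CoVs in quadrature: sqrt(cov_1^2 + cov_2^2 + ...).
# quadrature_cov() is a hypothetical stand-in, not the package function.
quadrature_cov <- function(...) {
  components <- cbind(...)        # one column per uncertainty source
  sqrt(rowSums(components^2))
}

quadrature_cov(c(0.3, 0.1), c(0.4, 0.2))
```

Because the components are squared before summing, the largest source dominates the result: 0.3 and 0.4 combine to 0.5, not 0.7.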

Usage

combine_cov(...)

Arguments

...

Numeric vectors of equal length, one per uncertainty source, or a single matrix/data frame with one column per source.

Value

A numeric vector of combined coefficients of variation.

Examples

combine_cov(c(0.3, 0.1), c(0.4, 0.2))

Compare two footprint estimates.

Description

Align two consumption-based footprint estimates (for example the Leontief and land-balance methods) by consuming country and item and report their difference. Disagreement is diagnostic: it localises where a method's assumptions matter most.

Usage

compare_footprint_methods(method_a, method_b)

Arguments

method_a, method_b

Tibbles with area_code, item_cbs_code and value (consumption footprint).

Value

A tibble with area_code, item_cbs_code, value_a, value_b, abs_diff and rel_diff (relative to the larger of the two), ordered by descending abs_diff.
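
The returned columns amount to a keyed join followed by two differences. The base R sketch below reproduces them for a single country-item pair. Using an inner join (merge() default) is an assumption; how the package treats rows present in only one estimate is not stated here.

```r
a <- data.frame(area_code = 1L, item_cbs_code = 10L, value = 30)
b <- data.frame(area_code = 1L, item_cbs_code = 10L, value = 25)

# Align on consuming country and item, keeping both values side by side.
cmp <- merge(a, b, by = c("area_code", "item_cbs_code"),
             suffixes = c("_a", "_b"))
cmp$abs_diff <- abs(cmp$value_a - cmp$value_b)
# Relative to the larger of the two estimates.
cmp$rel_diff <- cmp$abs_diff / pmax(cmp$value_a, cmp$value_b)
cmp[order(-cmp$abs_diff), ]
```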

Examples

a <- tibble::tibble(area_code = 1L, item_cbs_code = 10L, value = 30)
b <- tibble::tibble(area_code = 1L, item_cbs_code = 10L, value = 25)
compare_footprint_methods(a, b)

Compute environmental footprints.

Description

Trace environmental extensions through the supply chain using the Leontief inverse, following the FABIO methodology (Bruckner et al., 2019). The footprint shows how much of an environmental pressure (e.g. land use, water, emissions) is embodied in the final consumption of each product in each country.

The multiplier matrix is computed as $MP_{ij} = (e_i / X_i) \cdot L_{ij}$, where $e_i$ is the extension and $X_i$ the total output of sector $i$. For each demand category, the footprint is decomposed per target item using the FABIO diagonal approach: $FP = MP \cdot \text{diag}(y)$, aggregated by item.

For large systems, pass z_mat and x_vec instead of l_inv. This solves $(I - A) x = Y$ directly using a sparse LU factorisation, avoiding the dense Leontief inverse entirely and reducing memory from $O(n^2)$ to $O(\text{nnz})$, where nnz is the number of stored non-zero entries.
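
The two routes described above can be traced by hand on a two-sector system. The base R sketch below is illustrative only: it uses dense solve() rather than a sparse LU factorisation, applies none of the engine's clipping or capping, and picks final demand so that the system balances exactly (an assumption made for the example).

```r
z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
extensions <- c(50, 30)
# Final demand chosen so that output = intermediate use + final demand
# holds exactly (an assumption made for this example).
y <- x_vec - rowSums(z_mat)

# Technical coefficients: A_ij = Z_ij / X_j (divide each column by its output).
a_mat <- sweep(z_mat, 2, x_vec, "/")
i_minus_a <- diag(2) - a_mat

# Route 1: explicit Leontief inverse, then MP = diag(e / X) L.
l_inv <- solve(i_minus_a)
mp <- diag(extensions / x_vec) %*% l_inv
fp <- mp %*% diag(y)          # rows: origin sector, columns: target item

# Route 2: solve (I - A) x = y directly, never forming L.
x_solved <- solve(i_minus_a, y)

rowSums(fp)   # pressure embodied in final demand, per origin sector
```

In this balanced case the row sums of fp recover the direct extensions, which is the identity that check_footprint_conservation() tests approximately on real data.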

Usage

compute_footprint(
  l_inv = NULL,
  x_vec,
  y_mat,
  extensions,
  labels,
  z_mat = NULL,
  fd_labels = NULL,
  output_tol = 1e-08,
  value_added_floor = 0.001,
  max_column_sum = 100,
  conserve_extensions = TRUE,
  report_conservation = FALSE
)

Arguments

l_inv

Leontief inverse matrix from compute_leontief_inverse(). Ignored when z_mat is provided.

x_vec

Numeric vector of total output per sector.

y_mat

Final demand matrix from build_io_model().

extensions

Numeric vector of environmental extensions (e.g. hectares of land use) per sector. Must have the same length as x_vec.

labels

Tibble with area_code and item_cbs_code mapping row/column indices to their meaning. From build_io_model().

z_mat

Optional inter-industry flow matrix from build_io_model(). When provided, the system is solved directly (sparse LU), and l_inv is not needed.

fd_labels

Optional tibble labelling Y columns. Pass fd_labels[[i]] from build_io_model() output. When provided, footprints are decomposed per target item using the FABIO diagonal approach, and the result includes a target_fd column. When omitted, columns of Y are treated as sectors (appropriate only when Y is square).

output_tol

Minimum output considered valid when computing extension intensities. Sectors with x_vec <= output_tol get zero intensity to avoid infinite or numerically explosive footprints from zero-output residuals.

value_added_floor

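Minimum share of each sector's output that is treated as non-intermediate leakage when constructing A from z_mat. It takes effect only when max_column_sum is at the low-level default of 1 - value_added_floor (the default of compute_leontief_inverse()); compute_footprint() itself defaults max_column_sum to 100. Ignored when a precomputed l_inv is supplied without z_mat.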

max_column_sum

Maximum allowed column sum in A when using z_mat. Physical biomass systems can require more than one unit of intermediate input per unit of output, so the footprint path defaults to 100 and only clips extreme columns caused by residual inconsistencies or tiny outputs.

conserve_extensions

If TRUE, rescale positive footprint flows within each origin area/item so their sum does not exceed the corresponding positive extension total. This keeps footprint outputs conservative when capped coefficients or negative final demand columns would otherwise make positive-only paths larger than the source extension.

report_conservation

If TRUE, emit a message after computing the footprint that reports the conservation gap (the share of the direct extension that is not embodied in final demand), computed via check_footprint_conservation(). Off by default, so reporting is opt-in; when requested, the gap is always reported.

Value

A tibble with footprint results containing:

origin_area: Country where the pressure occurs.
origin_polity_code: WHEP polity for origin_area.
origin_item: Item causing the pressure.
target_area: Country consuming the product.
target_polity_code: WHEP polity for target_area.
target_item: Item consumed.
target_fd: Demand category (e.g. "food"). Only present when fd_labels is provided.
value: Footprint value in extension units.

Examples

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
l_inv <- compute_leontief_inverse(z_mat, x_vec)
y_mat <- matrix(c(85, 195), ncol = 1)
extensions <- c(50, 30)
labels <- tibble::tibble(
  area_code = c(1L, 1L),
  item_cbs_code = c(1L, 2L)
)

# Small system: pass pre-computed L
compute_footprint(l_inv, x_vec, y_mat, extensions, labels)

# Using Z directly (solves the system without forming L)
compute_footprint(
  x_vec = x_vec, y_mat = y_mat,
  extensions = extensions, labels = labels,
  z_mat = z_mat
)

Compute land footprints by physical trade balance.

Description

Estimate consumption-based land footprints by propagating direct land use through the bilateral physical trade network, an approach independent of the Leontief inverse used by compute_footprint(). Each country's supply pool (production plus imports) carries an embodied-land intensity s; solving the balance $(D - M) s = L$ per item gives s, where D is diagonal supply throughput, M routes import intensities, and L is direct land. The footprint of consumption (supply minus exports) is then consumption * s.

Unlike the multi-regional input-output method, this tracer handles only the trade dimension, not inter-product processing transformations. Comparing the two with compare_footprint_methods() isolates the effect of that assumption – an apples-to-apples stress test on shared FAOSTAT data.
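
One reading of the balance above is that each supply pool satisfies supply_i * s_i = land_i + sum over j of imports_ij * s_j. The base R sketch below solves that system for one item and two countries. The matrix layout (imports[i, j] is the flow into i from j) is an assumption for illustration, and the sketch does not handle countries with zero supply.

```r
# One item, two countries: country 1 produces 100 and exports 40 to country 2.
production <- c(100, 0)
land <- c(50, 0)                       # direct land use, L
imports <- matrix(c(0, 40, 0, 0), 2)   # imports[i, j]: flow into i from j

supply <- production + rowSums(imports)   # supply pool per country
exports <- colSums(imports)

# Balance: supply_i * s_i = land_i + sum_j imports[i, j] * s_j,
# i.e. (D - M) s = L with D = diag(supply) and M = imports.
s <- solve(diag(supply) - imports, land)

footprint <- (supply - exports) * s
footprint
```

The two footprints add up to the 50 units of direct land, so the imported goods carry the exporter's intensity and no land is lost or created along the way.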

Usage

compute_footprint_balance(production, trade, extension)

Arguments

production

Tibble with area_code, item_cbs_code and value (quantity produced).

trade

Tibble with from_code, to_code, item_cbs_code and value (quantity exported from from_code to to_code).

extension

Tibble with area_code, item_cbs_code and value (direct land use of domestic production).

Value

A tibble with area_code (consuming country), item_cbs_code, value (embodied land in consumption) and method ("land_balance").

Examples

production <- tibble::tibble(
  area_code = c(1L, 2L),
  item_cbs_code = c(10L, 10L),
  value = c(100, 0)
)
trade <- tibble::tibble(
  from_code = 1L, to_code = 2L, item_cbs_code = 10L, value = 40
)
extension <- tibble::tibble(
  area_code = c(1L, 2L),
  item_cbs_code = c(10L, 10L),
  value = c(50, 0)
)
compute_footprint_balance(production, trade, extension)

Compute first-use footprint paths.

Description

Decompose an origin footprint into the first sector that directly uses the origin product before the footprint reaches final demand. This is useful for Sankey views that show paths such as origin product -> first-use area -> first-use product -> final-demand area.

The decomposition uses the IO identity $x = d + A x$. For each selected origin sector $i$ and final-demand target, the origin requirement $x_i$ is split into direct final demand $d_i$ and direct intermediate use $A_{ij} x_j$. Values are multiplied by the origin extension intensity $e_i / X_i$.
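
The split can be followed on a two-sector system. The base R sketch below decomposes the footprint of origin sector 1 into its direct final-demand part and its first-use parts. It treats all final demand as a single target and uses a balanced system (both assumptions), and it applies none of the engine's floors or rescaling.

```r
z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
extensions <- c(50, 30)
d <- x_vec - rowSums(z_mat)            # balanced final demand (assumption)

a_mat <- sweep(z_mat, 2, x_vec, "/")   # A_ij = Z_ij / X_j
x <- solve(diag(2) - a_mat, d)         # total requirements, x = d + A x

origin <- 1
intensity <- extensions[origin] / x_vec[origin]   # e_i / X_i

# Split x_i into direct final demand and intermediate use by each sector j.
direct_fd <- intensity * d[origin]
first_use <- intensity * a_mat[origin, ] * x      # one entry per first-use sector

c(direct_fd = direct_fd, first_use = first_use)
```

The direct part and the first-use parts together add up to the origin sector's extension, so the path view redistributes the footprint without changing its total.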

Usage

compute_footprint_paths(
  z_mat,
  x_vec,
  y_mat,
  extensions,
  labels,
  fd_labels,
  origin_area = NULL,
  origin_item = NULL,
  output_tol = 1e-08,
  value_added_floor = 0.001,
  conserve_extensions = TRUE,
  min_value = 0
)

Arguments

z_mat

Inter-industry flow matrix from build_io_model().

x_vec

Numeric vector of total output per sector.

y_mat

Final demand matrix from build_io_model().

extensions

Numeric vector of environmental extensions per sector.

labels

Tibble with area_code and item_cbs_code mapping sectors.

fd_labels

Tibble labelling Y columns, from build_io_model().

origin_area

Optional area code vector limiting origin sectors.

origin_item

Optional item code vector limiting origin sectors.

output_tol

Minimum output considered valid when computing extension intensities.

value_added_floor

Minimum non-intermediate leakage share used when constructing technical coefficients from z_mat.

conserve_extensions

If TRUE, rescale positive paths within each origin area/item so their sum does not exceed the corresponding positive extension total.

min_value

Drop paths with values less than or equal to this value before returning.

Value

A tibble with origin_area, origin_item, use_area, use_item, target_area, target_item, role-specific polity metadata, target_fd, path_type, and value.

Compute final-product footprint paths.

Description

Decompose an origin footprint by the area and item of the product supplied to final demand. This adds the missing FABIO-viewer style phase between origin product and final-demand area: origin product -> supplied product area -> supplied product -> final-demand area.

Unlike compute_footprint_paths(), this does not show the first direct intermediate input. It shows the downstream product row in Y whose final demand carries the origin footprint.

Usage

compute_fp_product_paths(
  z_mat,
  x_vec,
  y_mat,
  extensions,
  labels,
  fd_labels,
  origin_area = NULL,
  origin_item = NULL,
  output_tol = 1e-08,
  value_added_floor = 0.001,
  conserve_extensions = TRUE,
  min_value = 0
)

Arguments

z_mat

Inter-industry flow matrix from build_io_model().

x_vec

Numeric vector of total output per sector.

y_mat

Final demand matrix from build_io_model().

extensions

Numeric vector of environmental extensions per sector.

labels

Tibble with area_code and item_cbs_code mapping sectors.

fd_labels

Tibble labelling Y columns, from build_io_model().

origin_area

Optional area code vector limiting origin sectors.

origin_item

Optional item code vector limiting origin sectors.

output_tol

Minimum output considered valid when computing extension intensities.

value_added_floor

Minimum non-intermediate leakage share used when constructing technical coefficients from z_mat.

conserve_extensions

If TRUE, rescale positive paths within each origin area/item so their sum does not exceed the corresponding positive extension total.

min_value

Drop paths with values less than or equal to this value before returning.

Value

A tibble with origin_area, origin_item, product_area, product_item, target_area, role-specific polity metadata, target_fd, and value.

Compute Leontief inverse.

Description

Compute the Leontief inverse matrix from intermediate flows and total output. The Leontief inverse captures both direct and indirect requirements across the entire supply chain, enabling footprint tracing.

The technical coefficients matrix is computed as $A_{ij} = Z_{ij} / X_j$, representing the input of sector $i$ needed per unit of output from sector $j$. Column sums of A are capped using max_column_sum to avoid singular systems from inconsistent supply-use data. By default this uses 1 - value_added_floor, preserving the previous conservative behavior for explicit Leontief inverses. The Leontief inverse is then $L = (I - A)^{-1}$.

For large systems (thousands of sectors) this function is not usable: the dense L matrix requires $n^2 \times 8$ bytes of memory (e.g. ~4.7 GiB for n = 25 000). Use compute_footprint() directly with z_mat and x_vec instead, which solves $(I - A) x = Y$ without ever materialising L.

Accepts both dense and sparse (Matrix package) inputs.
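
The steps described above (coefficients, column-sum cap, inversion, clamping) can be sketched in a few lines of base R. leontief_sketch() is a hypothetical helper written for illustration; it works on dense matrices only, does not guard against zero outputs, and its cap of 0.999 mirrors 1 - value_added_floor with the default floor of 0.001.

```r
# leontief_sketch() is a hypothetical helper, not the package function.
leontief_sketch <- function(z_mat, x_vec, max_column_sum = 0.999) {
  a_mat <- sweep(z_mat, 2, x_vec, "/")          # A_ij = Z_ij / X_j
  col_sums <- colSums(a_mat)
  # Rescale only the columns whose sum exceeds the cap.
  scale <- ifelse(col_sums > max_column_sum, max_column_sum / col_sums, 1)
  a_mat <- sweep(a_mat, 2, scale, "*")
  l_inv <- solve(diag(nrow(a_mat)) - a_mat)     # L = (I - A)^-1
  pmax(l_inv, 0)                                # clamp negative entries
}

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
leontief_sketch(z_mat, x_vec)
```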

Usage

compute_leontief_inverse(
  z_mat,
  x_vec,
  max_n = 5000,
  value_added_floor = 0.001,
  max_column_sum = 1 - value_added_floor
)

Arguments

z_mat

Square numeric matrix of inter-industry flows. Entry $Z_{ij}$ is the flow from sector $i$ to sector $j$. Can be dense or sparse.

x_vec

Numeric vector of total output per sector. Must have the same length as nrow(z_mat).

max_n

Maximum system size before aborting. Defaults to 5000. Set higher at your own risk of memory exhaustion.

value_added_floor

Minimum share of each sector's output that is treated as non-intermediate leakage when constructing A if max_column_sum is left at its default.

max_column_sum

Maximum allowed column sum in A. Columns above this value are rescaled. Defaults to 1 - value_added_floor.

Value

The Leontief inverse matrix $L$. Negative values are set to zero. Returns a dense matrix.

Examples

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
compute_leontief_inverse(z_mat, x_vec)

Consolidate a multi-source panel to one winning row per cell.

Description

Reduce a long panel in which several sources report the same (.by, time_col) cell to a single winning row per cell, chosen by an explicit source-priority ranking with measure-aware demotion, a coverage tie-break, an optional quality tie-break, and a continuity override. It is the general form of the priority-based deduplication used to build the long-term historical energy panel.

Usage

consolidate_sources(
  data,
  value_col,
  source_col,
  priority,
  .by = NULL,
  time_col = year,
  drop_at = 100L,
  measure = NULL,
  tie_break = NULL,
  continuity_override = TRUE,
  verbose = TRUE
)

Arguments

data

A tibble with one row per source per (.by, time_col) cell.

value_col

Unquoted name of the value column. Coverage counts the cells where this column is non-missing.

source_col

Unquoted name of the source-label column.

priority

Source-to-rank map, as either a named integer vector (c(OWID = 1L, Malanima = 4L)) or a two-column data frame (source, rank). Lower rank wins. Sources absent here take the fallback rank drop_at - 1L.

.by

Character vector of grouping columns that, with time_col, key a cell (for example c("region", "category")). NULL (default) keys cells by time_col alone.

time_col

Unquoted name of the time column. Default: year. Must be numeric; the continuity override treats a difference of one as adjacent.

drop_at

Integer rank at or above which a source is dropped before consolidation. Default: 100L.

measure

Optional named list of measure-demotion options:

basis: data frame flagging measure-mismatched rows. It must contain the source column and may add further key columns present in data (for example a category column) to scope the flag; a data row is flagged when it matches any basis row on all its columns. Default: NULL (no demotion).
penalty: integer added to the effective rank of a flagged, non-exempt row. Default: 1000L (larger than any sensible base rank, so a flagged source falls below every unflagged one while flagged sources keep their relative order).
exempt: one-sided formula selecting rows the penalty never applies to, such as ~ region == "WLD", evaluated on the rows that survive the hard drop. Default: NULL.

tie_break

Optional named list of options breaking equal-rank ties:

coverage: logical, break ties by broader within-series coverage. Default: TRUE.
quality_col: string naming a quality column used as a tie-break after coverage. Default: NULL.
quality_levels: character vector ordering quality_col values best first (unlisted values rank last). Required when quality_col is set.

continuity_override

Logical. Revert isolated single-period winner flips. Default: TRUE.

verbose

Logical. Report the drop count, name-order ties, and continuity reversions. Default: TRUE.

Details

Selection proceeds in four stages.

Hard drop. Every row whose source ranks at or above drop_at is removed before any cell is contested, so a pinned source can never win even an uncontested cell. Sources absent from priority receive the documented fallback rank drop_at - 1L: kept in play but ranked below every source listed with a smaller rank. To exclude an unreliable source, list it at drop_at or above.
Measure-aware demotion. A source can report a different measure than the panel's target concept (production where the panel means consumption, generation shares where it means primary energy, a sector fragment where it means a category total). Rows flagged by measure$basis receive measure$penalty added to their effective rank, so a measure-mismatched source loses any cell a measure-consistent source also reports, yet still wins a cell it alone reports (a lone reporter is never demoted away). Rows matching measure$exempt keep their base rank (for example world-level cells, where production equals consumption).
Winner selection. Within each (.by, time_col) cell any row with a real (non-missing) value outranks every value_col-missing row, so a higher-priority source's NA never discards a lower-priority source's real observation; a cell wins NA only when no source reports a real value. Among rows with a real value the winner is the row of lowest effective rank; ties are broken by broader within-series coverage (the count of cells the source reports across the .by group) when tie_break$coverage, then by tie_break$quality_col ordered per tie_break$quality_levels, then by ascending source name (reported when verbose).
Continuity override. When enabled, an isolated single-period winner flip is reverted: if the immediately preceding and following periods share a different winner that also reports the middle period, that continuous source reclaims the middle cell, removing single-period teeth from otherwise smooth series. The reversion is skipped when the flanking source's middle-period value is itself missing (continuity never reinstates an NA) and when it would hand a cell won by a measure-consistent source back to a measure-demoted one: continuity never undoes the measure penalty, because a single-period source switch is cosmetic while a measure switch corrupts the series.
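
The first three stages can be sketched in base R. This is a simplified illustration, not the package's implementation: it omits the coverage and quality tie-breaks and the continuity override, and the panel values, the flagged vector and the "Unlisted" and "Pinned" sources are hypothetical.

```r
panel <- data.frame(
  year = c(1900, 1900, 1900, 1901, 1901),
  source = c("OWID", "Malanima", "Unlisted", "Malanima", "Pinned"),
  value = c(NA, 20, 25, 21, 30),
  stringsAsFactors = FALSE
)
priority <- c(OWID = 1L, Malanima = 4L, Pinned = 100L)
drop_at <- 100L
penalty <- 1000L
flagged <- "Malanima" # hypothetical: sources flagged as measure-mismatched

# Base rank; sources absent from priority take the fallback rank drop_at - 1L.
base_rank <- unname(priority[panel$source])
base_rank[is.na(base_rank)] <- drop_at - 1L
panel$source_rank <- base_rank

# Stage 1, hard drop: remove every source ranked at or above drop_at.
panel <- panel[panel$source_rank < drop_at, ]

# Stage 2, measure-aware demotion: add the penalty to flagged rows.
panel$effective_rank <- panel$source_rank +
  ifelse(panel$source %in% flagged, penalty, 0L)

# Stage 3, winner selection: real values first, then lowest effective rank,
# then ascending source name.
ord <- order(
  panel$year, is.na(panel$value), panel$effective_rank, panel$source
)
panel <- panel[ord, ]
winners <- panel[!duplicated(panel$year), ]
```

In this toy panel the 1900 cell goes to the unlisted source (rank 99), because the top-ranked source reports NA and the flagged source carries the penalty. The flagged source still wins 1901, where it is the only reporter left after the hard drop.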

This operationalises the AFE decision "Consolidate multi-source panels measure-consistently" (wiki/decisions/measure-consistent-panel-consolidation): measure identity is part of the dedup key's semantics, and priority alone cannot arbitrate cells whose sources report different measures.

The input must hold at most one row per source per cell; pre-aggregate any sub-detail rows first (the function aborts on duplicates rather than silently summing them).

Value

A tibble with the winning row per (.by, time_col) cell, the original columns of data, and four added provenance columns: n_sources (distinct sources contesting the cell after the hard drop), source_rank (the winner's base priority rank), effective_rank (base rank plus any measure penalty applied), and measure_demoted (whether the winner carried the measure penalty; a flagged source only wins a cell that no measure-consistent source reports). Rows are ordered by .by then time_col.

Examples

panel <- tibble::tribble(
  ~year, ~region, ~category, ~source, ~value,
  1900, "WLD", "Coal", "OWID", 10,
  1900, "WLD", "Coal", "Malanima", 20,
  1901, "WLD", "Coal", "Malanima", 21,
  1902, "WLD", "Coal", "Malanima", 22
)

consolidate_sources(
  panel,
  value_col = value,
  source_col = source,
  priority = c(OWID = 1L, Malanima = 4L),
  .by = c("region", "category"),
  verbose = FALSE
)

Bouwman feed conversion ratios.

Description

Feed conversion (kilograms dry matter feed per kilogram product) and per feed type composition for the main production species, from Bouwman et al. (2005). Migrated from the afsetools Codes_coefs.xlsx workbook.

Usage

conv_bouwman

Format

A tibble with one row per species, feed type, region and anchor year:

item_bouwman: Bouwman species label (Beef cattle, Dairy cattle, Pigs, Poultry, Sheep and goats).
feed_type: Feed type: animals, crops, grass, residues or scavenging.
year: Anchor year (1970, 1995 or 2030).
region_bouwman: Bouwman seventeen region label.
conversion: Per feed type conversion factor (kg DM feed per kg product). Sum across feed types is the total feed conversion ratio; normalised across feed types it gives the default feed composition shares.
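
The relation between the per feed type factors, the total feed conversion ratio and the composition shares can be sketched as follows. The numbers are hypothetical and are not taken from the dataset.

```r
# Hypothetical per-feed-type conversions for one species, region and year
# (kg DM feed per kg product).
conversion <- c(
  animals = 0, crops = 2, grass = 5, residues = 1, scavenging = 0
)

# Total feed conversion ratio: sum across feed types.
total_fcr <- sum(conversion)

# Default feed composition shares: factors normalised to sum to one.
feed_shares <- conversion / total_fcr
```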

Source

Bouwman et al. (2005), via afsetools Codes_coefs.xlsx.

Krausmann per head feed intake.

Description

Annual feed intake (kilograms dry matter per head per year) for draft and non productive species that lack a product based feed conversion ratio, from Krausmann et al. (2013). Migrated from the afsetools Codes_coefs.xlsx workbook.

Usage

conv_krausmann

Format

A tibble with one row per species:

item_cbs_code: FAOSTAT commodity balance item code.
species: Species name (Horses, Asses, Mules, Camels and so on).
conversion: Feed intake (kg DM per head per year).

Source

Krausmann et al. (2013), via afsetools Codes_coefs.xlsx.

GRAFS Nitrogen (N) flows at the national level for Spain

Description

Provides N flows of the Spanish agro-food system on a national level between 1860 and 2020. This dataset is the national equivalent of the provincial GRAFS model and represents Spain as a single system without internal trade between provinces. All production, consumption and soil inputs are aggregated nationally before calculating trade with the outside.

Usage

create_n_nat_destiny(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A final tibble containing national N flow data by origin and destiny. It includes the following columns:

year: The year in which the recorded event occurred.
item: The item which was produced, defined in names_biomass_cb.
irrig_cat: Irrigation form (irrigated or rainfed)
box: One of the GRAFS model systems: cropland, Semi-natural agroecosystems, Livestock, Fish, or Agro-industry.
origin: The origin category of N: Cropland, Semi-natural agroecosystems, Livestock, Fish, Agro-industry, Deposition, Fixation, Synthetic, People (waste water), Livestock (manure).
destiny: The destiny category of N: population_food, population_other_uses, livestock_mono, livestock_rum (feed), export, Cropland (for N soil inputs).
mg_n: Nitrogen amount in megagrams (Mg).
province_name: Set to "Spain" for all national-level rows.

Examples

create_n_nat_destiny(example = TRUE)

N production for Spain

Description

Calculates N production at the provincial level in Spain. Production is derived from consumption, export, import, and other uses.

Usage

create_n_production(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble containing:

year: Year
province_name: Spanish province
item: Product item
box: Ecosystem box
prod: Produced N (Mg)

Examples

create_n_production(example = TRUE)

GRAFS Nitrogen (N) flows

Description

Provides N flows of the Spanish agro-food system at the provincial level between 1860 and 2020. This dataset is the basis of the GRAFS model and contains data in megagrams of N (MgN) for each year, province, item, origin and destiny. The origin column represents where N comes from, which includes N soil inputs, imports and production. The destiny column shows where N goes, which includes export, population food, population other uses, feed, or cropland (in the case of N soil inputs). Processed items, residues, woody crops and grazed weeds are taken into account.

Usage

create_n_prov_destiny(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A final tibble containing N flow data by origin and destiny. It includes the following columns:

year: The year in which the recorded event occurred.
province_name: The Spanish province where the data is from.
item: The item which was produced, defined in names_biomass_cb.
irrig_cat: Irrigation form (irrigated or rainfed)
box: One of the GRAFS model systems: cropland, Semi-natural agroecosystems, Livestock, Fish, or Agro-industry.
origin: The origin category of N: Cropland, Semi-natural agroecosystems, Livestock, Fish, Agro-industry, Deposition, Fixation, Synthetic, People (waste water), Livestock (manure).
destiny: The destiny category of N: population_food, population_other_uses, livestock_mono, livestock_rum (feed), export, Cropland (for N soil inputs).
mg_n: Nitrogen amount in megagrams (Mg).

Examples

create_n_prov_destiny(example = TRUE)

Nitrogen (N) soil inputs for Spain

Description

Calculates total nitrogen inputs to soils in Spain at the provincial level. This includes contributions from:

Atmospheric deposition (deposition)
Biological nitrogen fixation (fixation)
Synthetic fertilizers (synthetic)
Manure (excreta, solid, liquid) (manure)
Urban sources (urban)

Special land use categories and items are aggregated:

Semi-natural agroecosystems (e.g., Dehesa, Pasture_Shrubland)
Firewood biomass (e.g., Conifers, Holm oak)

Usage

create_n_soil_inputs(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble containing:

year: Year
province_name: Spanish province
item: Crop, land use, or biomass item
irrig_cat: Irrigation form (irrigated or rainfed)
box: Land use or ecosystem box for aggregation
deposition: N input from atmospheric deposition (Mg)
fixation: N input from biological N fixation (Mg)
synthetic: N input from synthetic fertilizers (Mg)
manure: N input from livestock manure (Mg)
urban: N input from urban sources (Mg)

Examples

create_n_soil_inputs(example = TRUE)

Eurostat crop classification codes

Description

Maps Eurostat crop codes to their full crop category names, used when integrating Eurostat agricultural statistics.

Usage

crops_eurostat

Format

A tibble where each row corresponds to one Eurostat crop category. It contains the following columns:

Crop: Eurostat crop code (e.g., "G0000", "G1000").
Name_Eurostat: Full name of the crop category as used in Eurostat (e.g., "Plants harvested green from arable land", "Temporary grasses and grazings").

Source

Eurostat Agricultural Statistics.

Examples

head(crops_eurostat)

Manure nitrogen application by crop and country

Description

Country- and crop-level estimates of manure nitrogen applied to cropland, from West et al. (2014). Used as a reference for spatializing manure N inputs in the WHEP pipeline.

Usage

crops_manure_n

Format

A tibble with one row per crop-country combination containing:

Crop_name: Crop name (character).
ISO: ISO 3166-1 alpha-3 country code.
Continent: Three-letter continent code (e.g. "AFR", "ASI").
Manure_N_Mg: Manure nitrogen applied in megagrams (Mg).

Source

West, P. C. et al. (2014). Leverage points for improving global food security and the environment. Science, 345(6194), 325–328. doi:10.1126/science.1246067

Examples

head(crops_manure_n)

Decompose a weighted aggregate ratio.

Description

Separates a change in an aggregate numerator-to-denominator ratio into a between-group composition effect and a within-group ratio effect. The default symmetric Kitagawa allocation is equivalent to a two-factor Shapley value for the weight and within-group-ratio factor blocks.

Usage

decompose_weighted_ratio(
  data,
  time,
  .by,
  ratio,
  method = c("kitagawa", "lmdi", "weights_first", "ratios_first", "all")
)

Arguments

data

A tibble with exactly two periods and one row per group-period.

time

An unquoted numeric, date-time, or ordered-factor column identifying the two ordered periods.

.by

An unquoted column identifying persistent groups.

ratio

An unquoted expression of the exact form numerator / denominator, using two bare numeric column names.

method

Decomposition method. One of "kitagawa", "lmdi", "weights_first", "ratios_first", or "all".

Details

For group g and endpoint t, the aggregate ratio is R_t = sum_g(w_gt * r_gt), where w_gt is the group's share of the total denominator and r_gt is its numerator-to-denominator ratio.

Write dw_g = w_g1 - w_g0 and dr_g = r_g1 - r_g0. The symmetric Kitagawa contributions are

$B_g = dw_g \, (r_{g0} + r_{g1}) / 2,$

$W_g = (w_{g0} + w_{g1}) \, dr_g / 2.$

The weights-first polar contributions are B_g = dw_g * r_g0 and W_g = w_g1 * dr_g; the ratios-first polar contributions are B_g = dw_g * r_g1 and W_g = w_g0 * dr_g. Kitagawa is their arithmetic mean. Its Shapley interpretation concerns the complete weight and ratio factor blocks; groups are not treated as Shapley players.

"lmdi" implements additive LMDI-I. With y_gt = w_gt * r_gt and the logarithmic mean L(a, b) = (a - b) / (log(a) - log(b)), using L(a, a) = a, its contributions are

$B_g = L(y_{g1}, y_{g0}) \log(w_{g1} / w_{g0}),$

$W_g = L(y_{g1}, y_{g0}) \log(r_{g1} / r_{g0}).$

Every method satisfies B_g + W_g = y_g1 - y_g0 and therefore sum_g(B_g + W_g) = R_1 - R_0, up to floating-point error. "all" returns every method on their common strictly positive domain.
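
The formulas above can be checked by hand in base R. The following sketch applies them to two groups and two periods; it is an illustration of the identities only, not the package's implementation, and all variable names are hypothetical.

```r
# Endpoint data: two groups (a, b), period 0 and period 1.
num0 <- c(a = 400, b = 1200)
den0 <- c(a = 40, b = 60)
num1 <- c(a = 600, b = 900)
den1 <- c(a = 50, b = 50)

# Weights (share of total denominator) and within-group ratios.
w0 <- den0 / sum(den0)
r0 <- num0 / den0
w1 <- den1 / sum(den1)
r1 <- num1 / den1
dw <- w1 - w0
dr <- r1 - r0

# Symmetric Kitagawa contributions.
between_kitagawa <- dw * (r0 + r1) / 2
within_kitagawa <- (w0 + w1) * dr / 2

# Polar paths: weights first, then ratios first.
between_wf <- dw * r0
within_wf <- w1 * dr
between_rf <- dw * r1
within_rf <- w0 * dr

# Additive LMDI-I, using the logarithmic mean with L(a, a) = a.
log_mean <- function(a, b) ifelse(a == b, a, (a - b) / (log(a) - log(b)))
y0 <- w0 * r0
y1 <- w1 * r1
between_lmdi <- log_mean(y1, y0) * log(w1 / w0)
within_lmdi <- log_mean(y1, y0) * log(r1 / r0)

# Aggregate change R_1 - R_0.
total_change <- sum(y1) - sum(y0)
```

For these numbers the aggregate ratio falls from 16 to 15. The Kitagawa split assigns -0.8 to the between effect and -0.2 to the within effect, and every method closes on the same total change of -1.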

The function requires exactly two periods and identical, unique group support. Denominators must be positive. Numerators may be zero for the arithmetic methods but must be positive for LMDI. Groups are never silently dropped and zeros are never replaced by an epsilon.

Reversing endpoints negates Kitagawa and LMDI-I contributions. Reversing a polar path negates the opposite forward polar path. Percentage contributions are signed and are not clipped to zero or 100. They are missing when abs(R_1 - R_0) is no larger than sqrt(.Machine$double.eps) * max(abs(R_0), abs(R_1)). High cancellation_index values indicate cancellation between the aggregate between and within effects. It is defined as 1 - abs(total_change) / (abs(between) + abs(within)), with zero used when every effect is zero. Results are descriptive accounting identities, not causal attribution, and depend on the chosen group resolution.
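
The two diagnostics in this paragraph can be sketched as small helpers. The function names are hypothetical, and the percentage formula (effect divided by total change, times 100) is an assumption about how the signed percentages are defined; only the missing-value threshold and the cancellation index formula come from the text above.

```r
# Cancellation index: 1 - |total| / (|between| + |within|), zero if all zero.
cancellation_index <- function(between, within) {
  denom <- abs(between) + abs(within)
  if (denom == 0) {
    return(0)
  }
  1 - abs(between + within) / denom
}

# Signed percentage contribution; NA when the aggregate change is negligible.
pct_contribution <- function(effect, r0, r1) {
  total_change <- r1 - r0
  tol <- sqrt(.Machine$double.eps) * max(abs(r0), abs(r1))
  if (abs(total_change) <= tol) {
    return(NA_real_)
  }
  100 * effect / total_change
}

same_sign <- cancellation_index(-0.8, -0.2) # effects reinforce each other
offsetting <- cancellation_index(2, -1)     # effects partly cancel
```

An index of zero means the between and within effects point the same way; values near one mean they almost fully offset each other.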

Value

A tibble with one component_type = "group" row per group and one component_type = "summary" row for each requested method. Group rows contain endpoint stocks, weights, ratios, signed contributions, and group closure. Summary rows contain aggregate endpoint ratios, signed effects, percentage contributions, closure diagnostics, and cancellation metrics. Additive signed contributions have the same units as ratio.

References

Kitagawa, E. M. (1955). Components of a Difference Between Two Rates. Journal of the American Statistical Association, 50(272), 1168-1194. doi:10.1080/01621459.1955.10501299.

Ang, B. W. (2005). The LMDI approach to decomposition analysis: a practical guide. Energy Policy, 33(7), 867-871. doi:10.1016/j.enpol.2003.10.010.

Examples

ratio_data <- tibble::tribble(
  ~year, ~group, ~numerator, ~denominator,
  2000, "a", 400, 40,
  2000, "b", 1200, 60,
  2020, "a", 600, 50,
  2020, "b", 900, 50
)

decompose_weighted_ratio(
  ratio_data,
  year,
  group,
  numerator / denominator
)

Estimate energy demand (Gross Energy) - Tier 2

Description

Calculate gross energy (GE) intake per IPCC 2019 Tier 2 equations (Vol 4, Ch 10). Estimates net energy components for maintenance, activity, lactation, work, pregnancy, growth, and wool, then derives total gross energy using the REM/REG ratio approach from IPCC Eq 10.16.

All coefficients come from internal package data.
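
The final aggregation step can be sketched as below, assuming the equation forms published in the IPCC guidelines for REM (Eq 10.14), REG (Eq 10.15) and gross energy (Eq 10.16). The net energy values and the digestibility are hypothetical inputs, the variable names are illustrative, and the package may differ in detail because it reads its coefficients from internal data.

```r
# Net energy components in MJ/day per head; hypothetical values.
ne <- c(
  maintenance = 45, activity = 0, lactation = 60, work = 0,
  pregnancy = 4.5, growth = 0, wool = 0
)
de_percent <- 70 # digestible energy as % of gross energy; hypothetical

# Ratio of net energy available for maintenance to digestible energy.
rem <- 1.123 - 4.092e-3 * de_percent + 1.126e-5 * de_percent^2 -
  25.4 / de_percent

# Ratio of net energy available for growth to digestible energy.
reg <- 1.164 - 5.160e-3 * de_percent + 1.308e-5 * de_percent^2 -
  37.4 / de_percent

# Gross energy: maintenance-type terms over REM, growth-type terms over REG,
# then divided by the digestible share of gross energy.
maintenance_terms <- c(
  "maintenance", "activity", "lactation", "work", "pregnancy"
)
growth_terms <- c("growth", "wool")
gross_energy <- (
  sum(ne[maintenance_terms]) / rem + sum(ne[growth_terms]) / reg
) / (de_percent / 100)
```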

Usage

estimate_energy_demand(data, method = "ipcc2019")

Arguments

data

A dataframe with columns species, cohort, heads, and optionally iso3. Optional production columns: weight, milk_yield_kg_day, fat_percent, weight_gain_kg_day, work_hours_day, pregnant_fraction, temperature_c, diet_quality, grazing_distance_km, system.

method

Method for calculation (default "ipcc2019").

Value

Dataframe with added gross_energy (MJ/day), intermediate net energy components, and method_energy tracking column.

Examples

tibble::tibble(
  species = "Dairy Cattle", cohort = "Adult Female",
  heads = 100, weight = 600, diet_quality = "High",
  milk_yield_kg_day = 20
) |>
  estimate_energy_demand() |>
  dplyr::select(species, cohort, heads, ne_maintenance,
    ne_activity, ne_lactation, ne_growth, gross_energy)

Estimate livestock nitrogen, carbon and volatile-solids excretion.

Description

Converts realised feed intake (the output of redistribute_feed()) into excreted nitrogen, carbon and volatile solids per year x territory x sub_territory x livestock_category. All excretion methods share one canonical nitrogen intake, n_intake = sum(intake_dm_t * feed_n_content), so the methods are directly comparable.

Usage

estimate_n_excretion(intake, options = list())

Arguments

intake

A tibble of realised feed intake with at least year, territory, sub_territory, livestock_category, item_cbs_code, feed_quality and intake_dm_t (the redistribute_feed() result).

options

A named list of method options:

method: "intake_minus_retention" (default, n_intake * (1 - n_retention_frac)) or "intake_minus_product_n" (n_intake - product_n).
method_vs: "intake_digestibility" (default, intake_dm_t * (1 - digestibility) * (1 - ash)).
product_n: a tibble (year, territory, sub_territory, livestock_category, product_n) required by "intake_minus_product_n".
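
The formulas named in these options can be sketched for a single livestock category as follows. All coefficient values are hypothetical, the nitrogen content is assumed to be expressed as a mass fraction of dry matter, and the variable names other than those quoted in the options are illustrative.

```r
# Two feed items eaten by one livestock category (tonnes dry matter).
intake_dm_t <- c(100, 500)

# Hypothetical coefficients per feed item.
feed_n_content <- c(0.03, 0.02) # t N per t DM (assumed unit)
digestibility <- c(0.8, 0.6)
ash <- c(0.05, 0.10)

# Hypothetical category-level values.
n_retention_frac <- 0.2
product_n <- 2.5

# Canonical nitrogen intake shared by both excretion methods.
n_intake <- sum(intake_dm_t * feed_n_content)

# "intake_minus_retention"
n_excretion_retention <- n_intake * (1 - n_retention_frac)

# "intake_minus_product_n"
n_excretion_product <- n_intake - product_n

# "intake_digestibility" volatile solids.
vs_excretion <- sum(intake_dm_t * (1 - digestibility) * (1 - ash))
```

Because both nitrogen methods start from the same n_intake, any difference between them comes only from how retained nitrogen is estimated.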

Value

A tibble with one row per year x territory x sub_territory x livestock_category and columns n_intake, n_excretion, c_excretion, vs_excretion, method_n_excretion and method_vs.

Examples

intake <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~livestock_category,
  ~item_cbs_code, ~feed_quality, ~intake_dm_t,
  2020L, "ES", NA, "Cattle_milk", 2513L, "high_quality", 100,
  2020L, "ES", NA, "Cattle_milk", NA, "grass", 500
)
estimate_n_excretion(intake)

Trade data sources

Description

Convert a dataframe in which each row has a year range into one in which each row is a single year, effectively 'expanding' the whole year range.
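
The expansion idea can be sketched in base R. This is a simplified illustration with hypothetical column names (name, start, end) and a fixed step of one year; it does not reproduce how the function treats its other input columns.

```r
sources <- data.frame(
  name = c("a", "b"),
  start = c(1, 2),
  end = c(3, 5),
  stringsAsFactors = FALSE
)

# One output row per year in each source's range.
expanded <- do.call(rbind, lapply(seq_len(nrow(sources)), function(i) {
  data.frame(
    name = sources$name[i],
    year = seq(sources$start[i], sources$end[i]),
    stringsAsFactors = FALSE
  )
}))
```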

Usage

expand_trade_sources(trade_sources)

Arguments

trade_sources

A tibble dataframe where each row contains the year range.

Value

A tibble dataframe where each row corresponds to a single year for a given source.

Examples

trade_sources <- tibble::tibble(
  Name = c("a", "b", "c"),
  Trade = c("t1", "t2", "t3"),
  Info_Format = c("year", "partial_series", "year"),
  Timeline_Start = c(1, 1, 2),
  Timeline_End = c(3, 4, 5),
  Timeline_Freq = c(1, 1, 2),
  `Imp/Exp` = "Imp",
  SACO_link = NA,
)
expand_trade_sources(trade_sources)

Feed characteristics by diet quality.

Description

DE%, NDF%, GE content, and crude protein percentage for High/Medium/Low diet quality levels.

Usage

feed_characteristics

Format

A tibble with diet_quality, de_percent, ndf_percent, ge_content_mj_kg_dm, cp_percent.

Source

IPCC 2019, Vol 4, Ch 10.

Examples

feed_characteristics

Feed taxonomy.

Description

Maps each feed item to its feed group, feed quality class, per consumer feed type labels and an allocation priority. Granivores get a restricted feed type set, so only grazers take fibrous roughage. Migrated from the afsetools Codes_coefs.xlsx workbook.

Usage

feed_taxonomy

Format

A tibble with one row per feed item:

item_cbs_code: FAOSTAT commodity balance item code.
item_cbs: Item name.
feed_group: Feed group (crop or material class).
feed_quality: Feed quality class: lactation, high_quality, low_quality, residues, grass, zoot_fixed or non_feed.
feed_quality_rank: Allocation priority rank (lower is allocated first): lactation and high_quality 1, low_quality 2, residues 3, grass 4.
granivore_feedtype: Feed type the item provides to granivores.
grazer_feedtype: Feed type the item provides to grazers.
zoot_fixed: Whether intake equals demand regardless of supply (compound feed ingredients that are not substitutable).

Source

afsetools Codes_coefs.xlsx.

Fill gaps by linear interpolation, or carrying forward or backward.

Description

Fills gaps (NA values) in a time-dependent variable by linear interpolation between two points, or by carrying the last value forward or the first value backward. It also creates a new variable indicating the source of the filled values.

Usage

fill_linear(
  data,
  value_col,
  time_col = year,
  interpolate = TRUE,
  log_space = FALSE,
  fill_forward = TRUE,
  fill_backward = TRUE,
  value_smooth_window = NULL,
  .by = NULL,
  .copy = TRUE
)

Arguments

data

A data frame containing one observation per row.

value_col

The column containing gaps to be filled.

time_col

The column containing time values. Default: year.

interpolate

Logical. If TRUE (default), performs linear interpolation.

log_space

Logical. If TRUE, interior interpolation is performed in log space (constant compound growth rate) for each gap segment whose two bracketing anchors are both finite and strictly positive; any segment with a non-positive or non-finite anchor falls back to linear interpolation. Default: FALSE, i.e. linear interpolation everywhere. Log-space fills are labelled "Log-linear interpolation" in the source column, distinct from "Linear interpolation", so the choice survives downstream. Carrying forward or backward is unaffected.

fill_forward

Logical. If TRUE (default), carries last value forward.

fill_backward

Logical. If TRUE (default), carries first value backward.

value_smooth_window

An integer specifying the window size for a centered moving average applied to the variable before gap-filling. Useful for variables with high inter-annual variability. If NULL (default), no smoothing is applied.

.by

A character vector with the grouping variables (optional).

.copy

Logical. If TRUE (default), data.table inputs are defensively copied before mutation. Set to FALSE when the caller owns the data and does not need the original preserved.

Value

A tibble data frame (ungrouped) where gaps in value_col have been filled, and a new "source" variable has been created indicating if the value is original or, in case it has been estimated, the gapfilling method that has been used.
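
The difference between linear and log-space interpolation, and the carrying of edge values, can be sketched with stats::approx(). This is an illustration of the arithmetic for a single series, not the package's implementation, and the data are hypothetical.

```r
year <- 2015:2020
value <- c(NA, 2, NA, 8, NA, NA)
known <- !is.na(value)

# Linear interpolation between anchors; rule = 2 carries the first value
# backward and the last value forward outside the observed range.
linear <- approx(year[known], value[known], xout = year, rule = 2)$y

# Log-space interpolation: interpolate log(value), then back-transform.
# This gives a constant compound growth rate between positive anchors.
log_linear <- exp(
  approx(year[known], log(value[known]), xout = year, rule = 2)$y
)
```

Between the anchors 2 and 8, the linear fill for 2017 is the arithmetic mean (5) and the log-space fill is the geometric mean (4).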

Examples

sample_tibble <- tibble::tibble(
  category = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"),
  year = c(
    "2015", "2016", "2017", "2018", "2019", "2020",
    "2015", "2016", "2017", "2018", "2019", "2020"
  ),
  value = c(NA, 3, NA, NA, 0, NA, 1, NA, NA, NA, 5, NA),
)
fill_linear(sample_tibble, value, .by = c("category"))
fill_linear(
  sample_tibble,
  value,
  interpolate = FALSE,
  .by = c("category"),
)
fill_linear(sample_tibble, value, log_space = TRUE, .by = c("category"))

Fill gaps using growth rates from proxy variables

Description

Fills missing values using growth rates from a proxy variable (reference series). Supports regional aggregations, weighting, and linear interpolation for small gaps.

Usage

fill_proxy_growth(
  data,
  value_col,
  proxy_col,
  time_col = year,
  .by = NULL,
  max_gap = Inf,
  max_gap_linear = 3,
  fill_scope = NULL,
  value_smooth_window = NULL,
  proxy_smooth_window = 1,
  output_format = "clean",
  verbose = TRUE
)

Arguments

data

A data frame containing time series data.

value_col

The column containing values to fill.

proxy_col

Character or vector. Proxy variable(s) for calculating growth rates. Supports multiple syntax formats:

Simple numeric proxy (e.g., "population"): Auto-detects numeric columns and uses them as proxy variable. Inherits the .by parameter to compute proxy values per group.
Simple categorical proxy (e.g., "region"): Auto-detects categorical columns and interprets as value_col:region. Aggregates value_col by the specified groups.
Advanced syntax (e.g., "gdp:region"): Format is "variable:group1+group2". Aggregates variable by specified groups.
Hierarchical fallback (e.g., c("population", "gdp:region")): Tries first proxy, falls back to second if first fails.
Weighted aggregation (e.g., "gdp[population]"): Weight variable by specified column during aggregation.

time_col

The column containing time values. Default: year.

.by

A character vector with the grouping variables (optional).

max_gap

Numeric. Maximum gap size to fill using growth method. Default: Inf.

max_gap_linear

Numeric. Maximum gap size for linear interpolation fallback. Default: 3.

fill_scope

Quosure. Filter expression to limit filling scope. Default: NULL.

value_smooth_window

Integer. Window size for a centered moving average applied to the value column before gap-filling. Useful for variables with high inter-annual variability. If NULL (default), no smoothing is applied.

proxy_smooth_window

Integer. Window size for moving average smoothing of proxy reference values before computing growth rates. Default: 1.

output_format

Character. Output format: "clean" or "detailed". Default: "clean".

verbose

Logical. Print progress messages. Default: TRUE.

Details

Combined Growth Sequence (Hierarchical Interpolation):

When using multiple proxies with hierarchical fallback, the function implements an intelligent combined growth sequence strategy:

Better proxies (earlier in hierarchy) are tried first for each gap.
If a better proxy has partial coverage within a gap, those growth rates are used for the covered positions.
Fallback proxies fill only the remaining positions where better proxies are not available.
Values filled by better proxies are protected from being overwritten.
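
The combined sequence can be pictured with a minimal base R sketch. It assumes that each proxy has already been reduced to a vector of year-on-year growth factors (NA where the proxy offers none). The vectors, their names and the forward-only projection are illustrative assumptions, not the package's internal implementation.

```r
# Hypothetical growth factors (value[t] / value[t - 1]) from two proxies.
growth_primary <- c(NA, 1.02, NA, 1.01)     # better proxy, partial coverage
growth_fallback <- c(NA, 1.05, 1.03, 1.04)  # fallback proxy, full coverage

# Use the better proxy where it exists; fall back only for the remaining slots.
growth_combined <- ifelse(
  is.na(growth_primary), growth_fallback, growth_primary
)

# Project the last observed value forward through the gap.
value <- c(100, NA, NA, NA)
for (i in seq_along(value)[-1]) {
  if (is.na(value[i])) {
    value[i] <- value[i - 1] * growth_combined[i]
  }
}
value
```

Positions covered by the better proxy keep its growth factor, so the fallback never overwrites them.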

Value

A data frame with filled values. If output_format = "clean", returns original columns with updated value_col and added source column. If "detailed", includes all intermediate columns.

Examples

# Fill GDP using population as proxy
data <- tibble::tibble(
  country = rep("ESP", 4),
  year = 2010:2013,
  gdp = c(1000, NA, NA, 1200),
  population = c(46, 46.5, 47, 47.5)
)

fill_proxy_growth(
  data,
  value_col = gdp,
  proxy_col = "population",
  .by = "country"
)

Fill gaps by adding the value of another variable to the previous value.

Description

Fills gaps in a variable with the sum of its previous value and the value of another variable. When a gap has multiple observations, the values are accumulated along the series. When there is a gap at the start of the series, it can either remain unfilled or assume an invisible 0 value before the first observation and start filling with cumulative sum.
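
The rule described above can be sketched in base R for a single, time-ordered series. The helper name fill_sum_sketch is hypothetical and the loop only illustrates the accumulation logic; it does no grouping and adds no source column.

```r
# Sketch of the accumulation rule for one ordered series (no grouping).
fill_sum_sketch <- function(value, change, start_with_zero = TRUE) {
  filled <- value
  for (i in seq_along(filled)) {
    if (!is.na(filled[i])) next  # observed values are kept as they are
    previous <- if (i > 1) {
      filled[i - 1]
    } else if (start_with_zero) {
      0  # the invisible 0 before the first observation
    } else {
      NA_real_  # leading gaps stay unfilled
    }
    filled[i] <- previous + change[i]
  }
  filled
}

value <- c(NA, 3, NA, NA, 0, NA)
change <- c(1, 2, 3, 4, 1, 1)
fill_sum_sketch(value, change, start_with_zero = TRUE)
fill_sum_sketch(value, change, start_with_zero = FALSE)
```

With start_with_zero = TRUE the leading gap becomes 0 + 1; with FALSE it stays NA. In both cases the gap after the observed 3 accumulates to 6 and then 10.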

Usage

fill_sum(
  data,
  value_col,
  change_col,
  time_col = year,
  start_with_zero = TRUE,
  .by = NULL
)

Arguments

data

A data frame containing one observation per row.

value_col

The column containing gaps to be filled.

change_col

The column whose values will be used to fill the gaps.

time_col

The column containing time values. Default: year.

start_with_zero

Logical. If TRUE (default), assumes an invisible 0 value before the first observation and fills with cumulative sum starting from the first change_col value. If FALSE, starting NA values remain unfilled.

.by

A character vector with the grouping variables (optional).

Value

An ungrouped tibble where gaps in value_col have been filled, with a new "source" column indicating whether each value is original or, if it has been estimated, which gap-filling method was used.

Examples

sample_tibble <- tibble::tibble(
  category = c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b", "b", "b"),
  year = c(
    "2015", "2016", "2017", "2018", "2019", "2020",
    "2015", "2016", "2017", "2018", "2019", "2020"
  ),
  value = c(NA, 3, NA, NA, 0, NA, 1, NA, NA, NA, 5, NA),
  change_variable = c(1, 2, 3, 4, 1, 1, 0, 0, 0, 0, 0, 1)
)
fill_sum(
  sample_tibble,
  value,
  change_variable,
  start_with_zero = FALSE,
  .by = c("category")
)
fill_sum(
  sample_tibble,
  value,
  change_variable,
  start_with_zero = TRUE,
  .by = c("category")
)

Describe the scope of a footprint result.

Description

Build a machine-readable scope record for a footprint: what is measured, in which units, by which method, and under which system boundary, allocation rule, data vintage and known limitations. This is the ISO 14044 goal-and-scope made into an attachable data object rather than prose, so a footprint result always travels with the assumptions behind it.

Attach it to a result with attach_scope() and read it back with get_scope().

Usage

footprint_scope(stressor, units, method, details = list())

Arguments

stressor

What is measured, e.g. "cropland".

units

Units of the footprint value, e.g. "ha".

method

Estimation method used, e.g. "FABIO-MRIO". This mirrors the multi-method ⁠method_<quantity>⁠ columns recorded elsewhere in the package.

details

Optional named list overriding any of: boundary (system boundary), allocation (allocation rule), vintage (data years) and limitations (free text).

Value

A one-row tibble with columns stressor, units, method, boundary, allocation, vintage and limitations.

Examples

footprint_scope(
  stressor = "cropland",
  units = "ha",
  method = "FABIO-MRIO",
  details = list(vintage = "1850-2023", allocation = "mass")
)

Local sensitivity of a footprint to each extension.

Description

One-at-a-time sensitivity analysis: nudge each sector's extension by a small relative step and measure the elasticity of the total footprint. High-elasticity sectors are where data quality matters most and where a result is most fragile.
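
As a rough illustration of what such an elasticity measures, the base R sketch below perturbs one extension at a time and divides the relative change in the total by the relative step. The helper elasticity_sketch is hypothetical, and the forward-difference formula is an assumption about how the elasticity could be computed, not a description of the package internals.

```r
# Toy model: the footprint is simply the sum of all sector extensions.
run_fn <- function(ext) data.frame(value = sum(ext))

elasticity_sketch <- function(run_fn, extensions, delta = 0.05) {
  base_total <- sum(run_fn(extensions)$value)
  vapply(seq_along(extensions), function(i) {
    perturbed <- extensions
    perturbed[i] <- perturbed[i] * (1 + delta)  # nudge a single sector
    new_total <- sum(run_fn(perturbed)$value)
    # Relative change in the footprint per relative change in the extension.
    ((new_total - base_total) / base_total) / delta
  }, numeric(1))
}

elasticities <- elasticity_sketch(run_fn, c(60, 40))
elasticities
```

For this additive toy model each elasticity equals the sector's share of the total (0.6 and 0.4), so the larger sector is the one where data quality matters most.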

Usage

footprint_sensitivity(run_fn, extensions, options = list())

Arguments

run_fn

Function taking an extension vector and returning a footprint tibble with a value column.

extensions

Numeric vector of base extensions per sector.

options

Named list overriding delta (relative step, default 0.05) and which (sector indices to test; default all non-zero extensions).

Value

A tibble with sector (index into extensions) and elasticity, ordered by descending absolute elasticity.

Examples

run_fn <- function(ext) tibble::tibble(value = sum(ext))
footprint_sensitivity(run_fn, extensions = c(60, 40))

Physical arable and permanent-crop land base (fallow-inclusive).

Description

Return FAO's physical land-use split of cropland into arable land (annual/temporary crops plus their rotational fallow and temporary meadows) and permanent-crop land (orchards, plantations, vineyards), keyed by ⁠(area_code, year)⁠.

whep's other crop-area paths (get_crop_land_extension(), build_cropgrids_land_extension()) are all derived from crop production / harvested area and therefore cannot recover the physical fallow-inclusive arable land of rain-fed, fallow-prone economies: in a drought year a country's cereal harvest collapses while its arable land (which counts the resting fallow) is unchanged, so a harvested-area method assigns that land to perennials and over-states the permanent share (e.g. Tunisia 2020 permanent share 0.73 from harvested area vs 0.43 physical). FAO's RL land-use survey (Cropland = ⁠Arable land⁠ + ⁠Permanent crops⁠) is the physical land base; this function ingests it.

From 1961 the split is FAO's own (source == "fao"). Before 1961 (FAOSTAT's start) it is backcast from LUH2 land use: LUH2's annual vs. perennial crop functional types give a perennial fraction and a cropland shape that are spliced onto the FAO 1961 level so the series is continuous (source == "luh2"). See Details.

Usage

get_arable_permanent_land(
  years = NULL,
  input_dir = NULL,
  data = NULL,
  luh2_data = NULL,
  example = FALSE
)

Arguments

years

Integer vector of years to return, or NULL (default) for all available (1700-2025). The pre-1961 LUH2 backcast is computed only when years is NULL or requests a year before 1961.

input_dir

Optional directory holding a local FAOSTAT RL land-use file (faostat_land_use.csv or a parquet with the FAOSTAT RL columns). If NULL (default) the pinned faostat-landuse dataset is read via whep_read_file().

data

Optional in-memory FAOSTAT RL table in the raw pin schema (columns ⁠Area Code⁠, ⁠Item Code⁠, Element, Unit, Year, Value), used instead of the pin (chiefly for testing).

luh2_data

Optional in-memory LUH2 land-use table (columns ISO3, Year, Land_Use, Area_Mha) used for the pre-1961 backcast instead of the pinned luh2-areas dataset (chiefly for testing).

example

If TRUE, return a small illustrative table without reading remote data. Defaults to FALSE.

Details

The FAO identity ⁠Cropland = Arable land + Permanent crops⁠ holds in the source to rounding for essentially all country-years; permanent_ha is taken as ⁠Cropland - Arable land⁠ (clamped at 0) so arable_ha + permanent_ha reconstructs cropland_ha exactly wherever FAO reports Arable <= Cropland. Where FAO reports ⁠Arable land⁠ but not ⁠Permanent crops⁠ (924 country-years, mostly arable-only economies) this yields the permanent land the survey implies; where it reports ⁠Permanent crops⁠ but not ⁠Arable land⁠ (a few coconut atolls) arable_ha is filled from ⁠Cropland - Permanent crops⁠.
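
The identity-based fill can be sketched in base R as follows. The three records and their values are hypothetical, and the column names are simplified stand-ins for the FAOSTAT items.

```r
# Hypothetical FAOSTAT RL records in hectares; NA marks an unreported item.
land <- data.frame(
  cropland = c(100, 50, 30),
  arable = c(80, 55, NA),
  permanent = c(20, NA, 25)
)

# Fill arable land from the identity where only permanent crops are reported.
arable_ha <- ifelse(
  is.na(land$arable), land$cropland - land$permanent, land$arable
)

# Permanent land is the remainder of cropland, clamped at zero.
permanent_ha <- pmax(land$cropland - arable_ha, 0)

data.frame(cropland_ha = land$cropland, arable_ha, permanent_ha)
```

The second record reports more arable land than cropland, so its permanent land is clamped to 0 and the two parts no longer add up to cropland. This is the one case where the reconstruction is not exact.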

Pre-1961 backcast: LUH2 annual cropland is c3ann + c4ann + c3nfx, perennial is c3per + c4per. For each country the perennial fraction and the cropland level are rescaled by their ratio to the LUH2 value at 1961 and multiplied by the FAO 1961 perennial fraction and cropland, so both match FAO exactly at the 1961 splice point and carry LUH2's earlier dynamics backwards. Countries without a FAO 1961 anchor receive no backcast.
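
The splice can be sketched for one country in base R. All numbers are hypothetical, and deriving permanent_ha as cropland times the perennial fraction is an assumption made for illustration.

```r
years <- 1959:1961

# Hypothetical LUH2 series for one country.
luh2_cropland <- c(8, 9, 10)                      # annual + perennial, Mha
luh2_perennial_fraction <- c(0.10, 0.11, 0.125)   # perennial / cropland

# Hypothetical FAO values at the 1961 splice point.
fao_cropland_1961 <- 12e6                         # hectares
fao_perennial_fraction_1961 <- 0.25

anchor <- which(years == 1961)

# Rescale each LUH2 series by its ratio to its own 1961 value, then
# multiply by the FAO 1961 level so the series meet exactly at 1961.
cropland_ha <- fao_cropland_1961 *
  luh2_cropland / luh2_cropland[anchor]
perennial_fraction <- fao_perennial_fraction_1961 *
  luh2_perennial_fraction / luh2_perennial_fraction[anchor]

# Assumption: split cropland with the backcast perennial fraction.
permanent_ha <- cropland_ha * perennial_fraction
arable_ha <- cropland_ha - permanent_ha

data.frame(year = years, arable_ha, permanent_ha, cropland_ha)
```

At 1961 the sketch reproduces the FAO level and fraction exactly, while earlier years follow LUH2's relative dynamics.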

Value

A tibble with one row per ⁠(area_code, year)⁠:

area_code: integer FAOSTAT area code (harmonised via polity_area_crosswalk; the FAOSTAT "China" aggregate 351 is dropped).
year: integer.
arable_ha, permanent_ha, cropland_ha: physical land area in hectares.
source: provenance, "fao" (>= 1961) or "luh2" (pre-1961 backcast).

Examples

get_arable_permanent_land(example = TRUE)

Bilateral trade data

Description

Reports trade between pairs of countries in given years.

Usage

get_bilateral_trade(example = FALSE, cbs = NULL)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

cbs

Optional pre-computed wide CBS tibble from get_wide_cbs(). If NULL (default), it is built internally.

Value

A tibble with the reported trade between countries. For efficient memory usage, the tibble is not exactly in tidy format. It contains the following columns:

year: The year in which the recorded event occurred.
item_cbs_code: FAOSTAT internal code for the item that is being traded. For code details see e.g. add_item_cbs_name().
bilateral_trade: Square matrix of NxN dimensions where N is the total number of countries being considered. The matrix row and column names are exactly equal and they represent country codes.
- Row name: FAOSTAT internal code for the country that is exporting the item. For code details see e.g. add_area_name().
- Column name: FAOSTAT internal code for the country that is importing the item. See row name explanation above.
If m is the matrix, the value at m["A", "B"] is the trade in tonnes from country "A" to country "B", for the corresponding year and item. The matrix can be considered balanced. This means:
- The sum of all values from row "A", where "A" is any country, should match the total exports from country "A" reported in the commodity balance sheet (which is considered more accurate for totals).
- The sum of all values from column "A", where "A" is any country, should match the total imports into country "A" reported in the commodity balance sheet (which is considered more accurate for totals).
The sums may not exactly match the expected values because of precision issues and/or the iterative proportional fitting algorithm not converging fast enough, but they should be very close to the desired totals.

The step by step approach to obtain this data tries to follow the FABIO model and is explained below. All the steps are performed separately for each group of year and item.

From the FAOSTAT reported bilateral trade, there are sometimes two values for one trade flow: the exported amount claimed by the reporter country and the import amount claimed by the partner country. Here, the export data was preferred, i.e., if country "A" says it exported X tonnes to country "B" but country "B" claims they got Y tonnes from country "A", we trust the export data X. This choice is only needed if there exists a reported amount from both sides. Otherwise, the single existing report is chosen.
Complete the country data, that is, add any missing combinations of country trade with NAs, which will be estimated later. In the matrix form, this doesn't increase the memory usage since we had to build a matrix anyway (for the balancing algorithm), and the empty parts also take up memory. This is also done for total imports/exports from the commodity balance sheet, but these are directly filled with 0s instead.
The total imports and exports from the commodity balance sheet are balanced by downscaling the larger of the two to match the smaller. This is done in the following way:
- If total_imports > total_exports: Set import as total_exports * import / total_imports.
- If total_exports > total_imports: Set export as total_imports * export / total_exports.
The missing data in the matrix must be estimated. It's done like this:
- For each pair of exporter i and importer j, we estimate a bilateral trade m[i, j] using the export shares of i and import shares of j from the commodity balance sheet:
  - est_1 <- exports[i] * imports[j] / sum(imports), i.e., total exports of country i spread among other countries' import shares.
  - est_2 <- imports[j] * exports[i] / sum(exports), i.e. total imports of country j spread among other countries' export shares.
  - est <- (est_1 + est_2) / 2, i.e., the mean of both estimates.
  In the above computations, exports and imports are the original values before they were balanced.
- The estimates for data that already existed (i.e. non-NA) are discarded. For the ones left, for each row (i.e. exporter country), we get the difference between its balanced total export and the sum of original non-estimated data. The result is the gap we can actually fill with estimates, so as to not get past the reported total export. If the sum of non-discarded estimates is larger, it must be downscaled and spread by computing gap * non_discarded_estimate / sum(non_discarded_estimates).
- The estimates are scaled down by a trust factor, since we do not rely on the whole value: a missing value might mean that the specific trade was actually 0, and we do not want to overestimate too much. The chosen factor is 10%, so only 10% of the estimate's value is actually used to fill the NA from the original bilateral trade matrix.
The matrix is balanced, as mentioned before, using the iterative proportional fitting algorithm. The target sums for rows and columns are respectively the balanced exports and imports computed from the commodity balance sheet.
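
The steps above can be sketched for a single year and item in base R. The three countries, their totals and the reported flows are hypothetical, the helpers balance_totals and ipf are illustrative names, and the gap-capping sub-step is left out for brevity, so this is a simplified picture rather than the package's implementation.

```r
# Hypothetical commodity balance sheet totals for three countries (tonnes).
exports <- c(A = 60, B = 30, C = 30)
imports <- c(A = 20, B = 50, C = 30)

# Hypothetical reported bilateral trade; NA marks flows with no report.
reported <- matrix(
  c(NA, 30, 10,
    5, NA, NA,
    10, 15, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(names(exports), names(imports))
)

# Downscale the larger total so that both sides match.
balance_totals <- function(exports, imports) {
  total_exports <- sum(exports)
  total_imports <- sum(imports)
  if (total_exports > total_imports) {
    exports <- exports * total_imports / total_exports
  } else if (total_imports > total_exports) {
    imports <- imports * total_exports / total_imports
  }
  list(exports = exports, imports = imports)
}
balanced <- balance_totals(exports, imports)

# Estimate every flow from the original (unbalanced) totals.
est_1 <- outer(exports, imports) / sum(imports)
est_2 <- outer(exports, imports) / sum(exports)
est <- (est_1 + est_2) / 2

# Keep 10% of each estimate and use it only where no report exists.
trust_factor <- 0.1
seed <- ifelse(is.na(reported), trust_factor * est, reported)

# Iterative proportional fitting: alternately rescale rows and columns
# until the row sums reach the export targets.
ipf <- function(m, row_targets, col_targets,
                max_iterations = 1000, tolerance = 1e-9) {
  for (iteration in seq_len(max_iterations)) {
    m <- sweep(m, 1, row_targets / rowSums(m), `*`)
    m <- sweep(m, 2, col_targets / colSums(m), `*`)
    if (max(abs(rowSums(m) - row_targets)) < tolerance) break
  }
  m
}

trade <- ipf(seed, balanced$exports, balanced$imports)
round(trade, 2)
```

After fitting, rowSums(trade) matches the balanced exports and colSums(trade) the balanced imports, up to the chosen tolerance.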

Examples

get_bilateral_trade(example = TRUE)

Get the per-crop physical cropland extension from spatialization inputs.

Description

Convenience wrapper that loads the gridded land-use inputs, spatializes crop harvested area with build_gridded_landuse() (crop-level, no CFT aggregation), and converts it to a per-crop physical land extension with build_crop_land_extension(). The result is keyed by ⁠(year, area_code, item_cbs_code)⁠ and ready to use as extensions in compute_footprint().

Usage

get_crop_land_extension(
  input_dir = NULL,
  years = NULL,
  method = c("cropland_apportion", "intensity_divide"),
  use_type_constraint = FALSE,
  fill_missing_patterns = TRUE,
  example = FALSE
)

Arguments

input_dir

Directory holding the spatialization inputs (country_areas.parquet, crop_patterns.parquet, gridded_cropland.parquet, country_grid.parquet, and optionally multicropping.parquet). Typically ⁠<l_files_dir>/whep/inputs⁠. If NULL or unset, the pinned WHEP spatialization inputs are used.

years

Numeric vector of years to compute, or NULL for all available.

method

Physical-area conversion method passed to build_crop_land_extension().

use_type_constraint

If TRUE, restrict each crop to cells of its LUH2 type (requires type_cropland.parquet). Defaults to FALSE.

fill_missing_patterns

If TRUE (default), crops that have harvested area but no crop_patterns rows (e.g. Barley, absent from the Monfreda layer) are placed with a uniform fallback pattern over each producing country's cropland, so their land is not silently dropped.

example

If TRUE, return a small example output without reading remote/large data. Defaults to FALSE.

Value

A tibble with columns year, area_code, item_cbs_code, impact_u (physical land in hectares), and method_land.

Examples

get_crop_land_extension(example = TRUE)

Scrape activity data from FAOSTAT and post-process it

Description

Important: subsets can be requested dynamically through "...".

Note: crop data is scraped individually from the FAOSTAT QCL domain, which adds some overhead; this is acceptable.

Usage

get_faostat_data(activity_data, ..., example = FALSE)

Arguments

activity_data

activity data required from FAOSTAT; needs to be one of c('livestock','crop_area','crop_yield','crop_production').

...

can be whichever column name from get_faostat_bulk, particularly year, area or ISO3_CODE.

example

Logical. If TRUE, return a small hardcoded example tibble instead of scraping FAOSTAT. Useful for offline demos and documentation. Default FALSE.

Value

tibble of FAOSTAT for activity_data with columns area, item, element, year, value, unit and ISO3_CODE; default is for all years and countries.

Examples

get_faostat_data(example = TRUE)

Livestock feed intake

Description

Get amount of items used for feeding livestock.

Usage

get_feed_intake(
  example = FALSE,
  grain = c("national", "local"),
  demand_tier = c("ipcc", "fcr"),
  feed_mode = c("historical", "scenario"),
  years = NULL
)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

grain

Spatial grain of the feed allocation. "national" (default, one allocation per country) or "local" (the per-cell 0.5-degree engine, which is heavy and run via build_feed_intake_local(); calling it here redirects there).

demand_tier

Demand-estimation tier. "ipcc" (default, the rigorous IPCC Tier-2 energy demand for the ruminant species it covers, Bouwman FCR for pigs and poultry, Krausmann per-head for draft / other species) or "fcr" (the Bouwman / Krausmann feed-conversion magnitude for every species). Both grains allocate with redistribute_feed().

feed_mode

years

Integer vector of years to build, or NULL (default) for every year in the production data (1850-2023 via the LUH2 extension). Restricting the range cuts run time proportionally; allocation is independent per year, so a subset returns exactly the same rows for those years.

Value

A tibble with the feed intake data. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: The code of the country where the data is from. For code details see e.g. add_area_name().
live_anim_code: Commodity balance sheet code for the type of livestock that is fed. For code details see e.g. add_item_cbs_name().
item_cbs_code: The code of the item that is used for feeding the animal. For code details see e.g. add_item_cbs_name().
feed_type: The type of item that is being fed. It can be one of:
- animals: Livestock product, e.g. ⁠Bovine Meat⁠, ⁠Butter, Ghee⁠, etc.
- crops: Crop product, e.g. ⁠Vegetables, Other⁠, Oats, etc.
- residues: Crop residue, e.g. Straw, ⁠Fodder legumes⁠, etc.
- grass: Grass, e.g. Grassland, ⁠Temporary grassland⁠, etc.
- scavenging: Other residues. Single Scavenging item.
supply: The computed amount in tonnes of this item that should be fed to this animal, when sharing the total item feed use from the Commodity Balance Sheet among all livestock.
intake: The actual amount in tonnes that the animal needs, which can be less than the theoretical used amount from supply.
intake_dry_matter: The amount specified by intake but only considering dry matter, so it should be less than intake.
loss: The amount that is not used for feed. This is supply - intake.
loss_share: The share of supply that is lost. This is loss / supply.
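
The relation between the last four columns can be sketched in base R with hypothetical values. Returning NA for zero supply is an assumption made here to avoid dividing by zero; the package may handle that case differently.

```r
# Hypothetical supply and intake in tonnes for three feed items.
feed <- data.frame(
  supply = c(120, 80, 0),
  intake = c(100, 80, 0)
)

# Loss is whatever part of the supply is not eaten.
feed$loss <- feed$supply - feed$intake

# Share of the supply that is lost (assumption: NA when supply is zero).
feed$loss_share <- ifelse(feed$supply > 0, feed$loss / feed$supply, NA_real_)

feed
```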

Examples

get_feed_intake(example = TRUE)

Get WHEP polity geometries

Description

Returns the periodized polity database, including geometry. Pass polity_codes to retrieve a subset that can be joined to outputs from add_polity_code().

Usage

get_polity_geometries(polity_codes = NULL)

Arguments

polity_codes

Optional character vector of WHEP polity codes.

Value

An sf data frame.

Primary items production

Description

Get amount of crops, livestock and livestock products.

Usage

get_primary_production(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the item production data. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: Legacy numeric reporting area code.
polity_area_code: Numeric WHEP reporting polity code used for matrix workflows. This currently matches area_code.
reporting_polity_code: WHEP polity code for the reporting polygon.
reporting_polity_name: WHEP polity name for the reporting polygon.
reporting_polity_has_geometry: Whether the reporting polity has a polygon in the WHEP polity database.
item_prod_code: FAOSTAT internal code for each produced item.
item_cbs_code: FAOSTAT internal code for each commodity balance sheet item. The commodity balance sheet contains an aggregated version of production items. This field is the code for the corresponding aggregated item.
live_anim_code: Commodity balance sheet code for the type of livestock that produces the livestock product. It can be:
- NA: The entry is not a livestock product.
- Non-NA: The code for the livestock type. The name can also be retrieved by using add_item_cbs_name().
unit: Measurement unit for the data. Here, keep in mind three groups of items: crops (e.g. ⁠Apples and products⁠, Beans...), livestock (e.g. ⁠Cattle, dairy⁠, Goats...) and livestock products (e.g. ⁠Poultry Meat⁠, ⁠Offals, Edible⁠...). Then the unit can be one of:
- tonnes: Available for crops and livestock products.
- ha: Hectares, available for crops.
- t_ha: Tonnes per hectare, available for crops.
- heads: Number of animals (stocks), available for livestock.
- slaughtered_heads: Number of animals slaughtered, available for livestock.
- LU: Standard Livestock Unit measure, available for livestock.
- t_head: tonnes per head, available for livestock products.
- t_LU: tonnes per Livestock Unit, available for livestock products.
value: The amount of item produced, measured in unit.

Examples

get_primary_production(example = TRUE)

Crop residue items

Description

Get type and amount of residue produced for each crop production item.

Usage

get_primary_residues(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the crop residue data. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: The code of the country where the data is from. For code details see e.g. add_area_name().
item_cbs_code_crop: FAOSTAT internal code for each commodity balance sheet item. This is the crop that is generating the residue.
item_cbs_code_residue: FAOSTAT internal code for each commodity balance sheet item. This is the obtained residue. In the commodity balance sheet, this can be three different items right now:
- 2105: Straw
- 2106: ⁠Other crop residues⁠
- 2107: Firewood
These are actually not FAOSTAT defined items, but custom defined by us. When necessary, FAOSTAT codes are extended for our needs.
value: The amount of residue produced, measured in tonnes.

Examples

get_primary_residues(example = TRUE)

Processed products share factors

Description

Reports quantities of commodity balance sheet items used for processing and quantities of their corresponding processed output items.

Usage

get_processing_coefs(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the quantities for each processed product. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: The code of the country where the data is from. For code details see e.g. add_area_name().
item_cbs_code_to_process: FAOSTAT internal code for each one of the items that are being processed and will give other subproduct items. For code details see e.g. add_item_cbs_name().
value_to_process: tonnes of this item that are being processed. It matches the amount found in the processing column from the data obtained by get_wide_cbs().
item_cbs_code_processed: FAOSTAT internal code for each one of the subproduct items that are obtained when processing. For code details see e.g. add_item_cbs_name().
initial_conversion_factor: estimate for the number of tonnes of item_cbs_code_processed obtained for each tonne of item_cbs_code_to_process. It will be used to compute the final_conversion_factor, which leaves everything balanced. TODO: explain how it's computed.
initial_value_processed: first estimate for the number of tonnes of item_cbs_code_processed obtained from item_cbs_code_to_process. It is computed as value_to_process * initial_conversion_factor.
conversion_factor_scaling: computed scaling needed to adapt initial_conversion_factor so as to get a final balanced total of subproduct quantities. TODO: explain how it's computed.
final_conversion_factor: final used estimate for the number of tonnes of item_cbs_code_processed obtained for each tonne of item_cbs_code_to_process. It is computed as initial_conversion_factor * conversion_factor_scaling.
final_value_processed: final estimate for the number of tonnes of item_cbs_code_processed obtained from item_cbs_code_to_process. It is computed as value_to_process * final_conversion_factor, which is equivalent to initial_value_processed * conversion_factor_scaling.

For the final data obtained, the quantities final_value_processed are balanced in the following sense: the total sum of final_value_processed for each unique tuple of ⁠(year, area_code, item_cbs_code_processed)⁠ should be exactly the quantity reported for that year, country and item_cbs_code_processed item in the production column obtained from get_wide_cbs(). This is because they are not primary products, so the amount from 'production' is actually the amount of subproduct obtained. TODO: Fix few data where this doesn't hold.
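
The arithmetic linking these columns can be sketched in base R. The two input items, their quantities and the scaling value are hypothetical; the scaling is taken as given because its derivation is not documented here. The final quantity is computed as value_to_process * final_conversion_factor, which is what the definitions of the other columns imply.

```r
# Two hypothetical inputs processed into the same output item.
coefs <- data.frame(
  item_to_process = c("input_1", "input_2"),  # placeholder labels, not codes
  value_to_process = c(100, 50),              # tonnes
  initial_conversion_factor = c(0.2, 0.4),
  conversion_factor_scaling = c(1.25, 1.25)   # taken as given
)

# Hypothetical 'production' reported for the processed item (tonnes).
production_reported <- 50

coefs$initial_value_processed <-
  coefs$value_to_process * coefs$initial_conversion_factor
coefs$final_conversion_factor <-
  coefs$initial_conversion_factor * coefs$conversion_factor_scaling
coefs$final_value_processed <-
  coefs$value_to_process * coefs$final_conversion_factor

# Balance check: the final quantities add up to the reported production.
sum(coefs$final_value_processed)
```

The initial estimates add up to 40 tonnes, and the scaling of 1.25 lifts them to the 50 tonnes reported as production.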

Examples

get_processing_coefs(example = TRUE)

Retrieve a result's provenance record.

Description

Return the provenance record attached by attach_provenance(), or NULL when none is present.

Usage

get_provenance(x)

Arguments

x

An object that may carry a whep_provenance attribute.

Value

The provenance tibble, or NULL.

Examples

get_provenance(tibble::tibble(value = 1))

Retrieve a result's scope record.

Description

Return the scope record attached by attach_scope(), or NULL when none is present.

Usage

get_scope(x)

Arguments

x

An object that may carry a whep_scope attribute.

Value

The scope tibble, or NULL.
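
The NULL-when-absent behaviour mirrors how plain R attributes work. The base R sketch below assumes the record is stored as an ordinary attribute named whep_scope, as the argument description suggests; the scope record shown is hypothetical, and attach_scope() may do more than set an attribute.

```r
result <- data.frame(value = 1)

# No scope attached yet: reading the attribute gives NULL.
before <- attr(result, "whep_scope", exact = TRUE)

# Hypothetical scope record stored as a plain attribute.
attr(result, "whep_scope") <- data.frame(
  stressor = "cropland",
  units = "ha",
  method = "FABIO-MRIO"
)

# Reading it back returns the attached record.
after <- attr(result, "whep_scope", exact = TRUE)
after$units
```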

Examples

get_scope(tibble::tibble(value = 1))

Commodity balance sheet data.

Description

Retrieve supply and use parts for each commodity balance sheet (CBS) item. Stock variations are split into two non-negative columns following the FABIO methodology.

Usage

get_wide_cbs(example = FALSE)

Arguments

example

If TRUE, return a small example output without downloading remote data. Default is FALSE.

Value

A tibble with the commodity balance sheet data in wide format. It contains the following columns:

year: The year in which the recorded event occurred.
area_code: The code of the country where the data is from. For code details see e.g. add_area_name().
item_cbs_code: FAOSTAT internal code for each item. For code details see e.g. add_item_cbs_name().

The other columns are quantities where total supply and total use should be balanced. Units are tonnes for most items, and heads for live animals (see items_cbs item_type).

For supply:

production: Produced locally.
import: Obtained from importing from other countries.
stock_withdrawal: Biomass taken out of storage (non-negative). Positive when stocks decrease.

For use:

food: Food for humans.
feed: Food for animals.
export: Released as export for other countries.
seed: Intended for new production.
processing: Used to obtain other subproducts.
other_uses: Any other use not included above.
stock_addition: Biomass placed into storage (non-negative). Positive when stocks increase.

There is an additional column domestic_supply which is computed as total use excluding export.
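
The balance between supply and use can be checked with a short base R sketch. The single row below is hypothetical; the column names follow the lists above.

```r
# One hypothetical balanced record (tonnes).
cbs <- data.frame(
  production = 100, import = 20, stock_withdrawal = 5,
  food = 60, feed = 25, export = 15, seed = 5,
  processing = 10, other_uses = 5, stock_addition = 5
)

supply_cols <- c("production", "import", "stock_withdrawal")
use_cols <- c(
  "food", "feed", "export", "seed",
  "processing", "other_uses", "stock_addition"
)

total_supply <- rowSums(cbs[supply_cols])
total_use <- rowSums(cbs[use_cols])

# Domestic supply is total use without exports.
domestic_supply <- total_use - cbs$export

c(total_supply = total_supply, total_use = total_use,
  domestic_supply = domestic_supply)
```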

Examples

get_wide_cbs(example = TRUE)

GLEAM animal weights.

Description

Typical live weights by region, species, system, and cohort.

Usage

gleam_animal_weights

Format

A tibble with region, species, system, cohort, weight_kg.

Source

MacLeod et al. (2018) GLEAM 3.0.

Examples

gleam_animal_weights

Nitrogen parameters for crop residues of feed materials.

Description

Nitrogen content of above- and below-ground residues and root-to-shoot ratios for feed materials.

Usage

gleam_crop_residue_nitrogen

Format

A tibble with columns:

material_number: Sequential material identifier.
material: Feed material code.
n_ag: Nitrogen content of above-ground residues.
rbg_bio: Ratio of below-ground residues to above-ground biomass.
n_bg: Nitrogen content of below-ground residues.
species_group: "ruminant" or "monogastric".

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.6.7 and S.6.8.

Examples

gleam_crop_residue_nitrogen

GLEAM crop residue parameters.

Description

Dry matter content and parameters for calculating crop residue yield by crop type.

Usage

gleam_crop_residue_params

Format

A tibble with columns:

crop: Crop name.
dry_matter_pct: Dry matter content (percent).
slope: Slope for residue yield calculation.
intercept: Intercept for residue yield calculation.

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.3.1. doi:10.1088/1748-9326/aad4d8

Examples

gleam_crop_residue_params

GLEAM dressing percentages.

Description

Carcass weight as percentage of live weight by species, production system, cohort, and GLEAM region. Includes country-specific overrides for industrial pig systems in Western Europe.

Usage

gleam_dressing_percentages

Format

A tibble with columns:

species: Animal species (Cattle, Buffaloes, Sheep, Goats, Pigs, Chicken).
production_system: Production system (Dairy, Beef, Backyard, Intermediate, Industrial, Layers, Broilers). NA for species without system breakdown.
cohort: Cohort (e.g. Adult and replacement female). NA for species without cohort breakdown.
country: Country name for country-specific values. NA for regional values.
gleam_region: GLEAM region abbreviation (NA, RUS, WE, EE, NENA, ESEA, OCE, SA, LAC, SSA). Here NA is the region code for North America, not a missing value.
dressing_percent: Dressing percentage.

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.9.1. doi:10.1088/1748-9326/aad4d8

Examples

gleam_dressing_percentages
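
As a rough illustration of how a dressing percentage is applied, the sketch below converts a live weight into a carcass weight. The helper `carcass_weight()` and the numbers are hypothetical and are not taken from the dataset.

```r
# Hypothetical helper: carcass weight from live weight and dressing percentage.
carcass_weight <- function(live_weight_kg, dressing_percent) {
  live_weight_kg * dressing_percent / 100
}

# Illustrative values only (not taken from gleam_dressing_percentages).
carcass_kg <- carcass_weight(live_weight_kg = 500, dressing_percent = 50)
carcass_kg
```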

Energy use emission factors for livestock production.

Description

Emission factors for embedded (feed-production) and direct (on-farm) energy use in livestock production, from GLEAM 3.0 Tables S.7.1 through S.7.7. The factors are expressed per kilogram of live weight, milk or egg, depending on the species and herd; see the denominator column. The GLEAM footnotes are materialised as derived rows:

Embedded energy for meat (non-dairy) cattle and all buffalo is half of the dairy cattle values (S.7.1 note a).
Embedded energy for non-dairy small ruminants is half of the listed values (S.7.2 note a).
Direct energy for dairy small ruminants is double the dairy cattle values (S.7.5 note a).

Usage

gleam_energy_use_ef

Format

A tibble in long format with columns:

species: Animal species or group ("cattle", "buffalo", "large_ruminants", "small_ruminants", "pigs", "chickens").
herd: Herd or product line ("dairy", "non_dairy", "broilers", "layers", "all"). NA for pigs.
grouping: Country or country group as reported by GLEAM (e.g. "OECD", "EU 27", "Least developed countries").
grouping_scheme: Which country grouping the grouping belongs to: "development3" (OECD / least developed / others), "region5" (OECD / four non-OECD regions) or "detailed15" (individual OECD members plus world regions).
system: Production system (e.g. "grassland_based", "industrial"). NA when not applicable.
climate: Climate zone ("arid", "humid", "temperate"). NA when not applicable.
energy_type: "embedded" or "direct".
denominator: Reporting basis of the factor: "lw" (live weight), "milk" or "egg".
emission_factor: Emission factor in kg CO2-eq per kg of the denominator.

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.7.1 through S.7.7. doi:10.1088/1748-9326/aad4d8

Examples

gleam_energy_use_ef
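
The idea of materialising a footnote as derived rows can be sketched in base R. The rows and emission factors below are made up for illustration; only the "half of the dairy cattle value" rule comes from the description above, and the dataset itself may have been built differently.

```r
# Hypothetical dairy-cattle rows; the emission factors are invented.
dairy <- data.frame(
  species = "cattle",
  herd = "dairy",
  energy_type = "embedded",
  emission_factor = c(0.10, 0.20)
)

# Derived rows following the S.7.1 footnote: half of the dairy cattle value.
non_dairy <- dairy
non_dairy$herd <- "non_dairy"
non_dairy$emission_factor <- dairy$emission_factor / 2

derived <- rbind(dairy, non_dairy)
derived
```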

GLEAM enteric fermentation parameters.

Description

Ym (methane conversion factor, as % of gross energy, GE) values by species and production system. Feedlot cattle use 3.0%, per IPCC 2019 Table 10.12.

Usage

gleam_enteric_params

Format

A tibble with species, system, ym_percent, notes.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.12.

Examples

gleam_enteric_params

GLEAM feed categories.

Description

Feed classification used in GLEAM 3.0.

Usage

gleam_feed_categories

Format

A tibble with feed_category, feed_type, description.

Source

MacLeod et al. (2018) GLEAM 3.0.

Examples

gleam_feed_categories

GLEAM feed use efficiency.

Description

Regional feed use efficiency (FUE) values for forages and crop residues of ruminant species.

Usage

gleam_feed_composition

Format

A tibble with columns:

feed_group: Feed material group (1-6 or 9-15).
feed_type: Feed type (mixed, grassland, or all).
gleam_region: GLEAM geographic region.
feed_use_efficiency: FUE value (0-1 fraction).

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.3.2.

Examples

gleam_feed_composition

GLEAM feed conversion ratios for monogastrics.

Description

Nutritional values for feed materials of monogastric species (chicken and pigs).

Usage

gleam_feed_conversion_ratios

Format

A tibble with columns:

number: Feed material number.
material: Feed material code.
gross_energy_j_kg: Gross energy (J per kg).
n_content_g_kg: Nitrogen content (g per kg DM).
me_chicken_j_kg: Metabolisable energy for chicken (J per kg).
me_pigs_j_kg: Metabolisable energy for pigs (J per kg).
digestibility_pct: Digestibility (percent).

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.3.4.

Examples

gleam_feed_conversion_ratios

GLEAM feed digestibility for ruminants.

Description

Nutritional values for feed materials of ruminant species, including gross energy, nitrogen content, and digestibility.

Usage

gleam_feed_digestibility

Format

A tibble with columns:

number: Feed material number.
material: Feed material code.
gross_energy_mj_kg: Gross energy (MJ per kg DM).
n_content_g_kg: Nitrogen content (g per kg DM).
digestibility_pct: Digestibility (percent).

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.3.3.

Examples

gleam_feed_digestibility

Emission factors for field operations on feed materials.

Description

CO2-equivalent emissions per hectare from field operations for ruminant and monogastric feed materials.

Usage

gleam_field_operation_ef

Format

A tibble with columns:

material_number: Sequential material identifier.
material: Feed material code (e.g. "GRASSF", "WHEAT").
emission_factor_kg_co2eq_ha: Emission factor in kg CO2-eq per hectare.
species_group: "ruminant" or "monogastric".

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.6.1 and S.6.2.

Examples

gleam_field_operation_ef

Country-level fraction of crop residues removed.

Description

Countries whose FracReMove value differs from the GLEAM default.

Usage

gleam_fracremove

Format

A tibble with columns:

country: Country name.
continent: Continent.
region: GLEAM region.
fracremove: Fraction of crop residues removed (0 to 1).

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Table S.6.9.

Examples

gleam_fracremove

GLEAM geographic hierarchy.

Description

Maps countries (ISO3) to GLEAM regions, FAOSTAT regions, and classification indicators.

Usage

gleam_geographic_hierarchy

Format

A tibble with columns:

iso3: ISO3 country code.
country: Country name.
continent: Continent.
faostat_region: FAOSTAT regional grouping.
gleam_region: GLEAM regional grouping.

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.A1-S.A2. doi:10.1088/1748-9326/aad4d8

Examples

gleam_geographic_hierarchy

GLEAM livestock categories.

Description

Species, production systems, and cohort definitions from GLEAM 3.0.

Usage

gleam_livestock_categories

Format

A tibble with columns:

species: Animal species.
production_system: Dairy, Beef, Meat, etc.
cohort: Age/sex cohort.
description: Cohort description.

Source

MacLeod et al. (2018) GLEAM 3.0 Model Description.

Examples

gleam_livestock_categories

Country-level mechanization levels for feed materials.

Description

Mechanization level by country for each feed material, for ruminant and monogastric species.

Usage

gleam_mechanization_levels

Format

A tibble in long format with columns:

country: Country name.
continent: Continent.
region: GLEAM region.
feed_material: Feed material code in lowercase.
mechanization_level: Numeric mechanization level.
species_group: "ruminant" or "monogastric".

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.6.3 and S.6.4.

Examples

gleam_mechanization_levels

GLEAM milk production.

Description

Average annual milk yields and lactation lengths by region.

Usage

gleam_milk_production

Format

A tibble with region, species, system, milk_kg_head_yr, lactation_days.

Source

MacLeod et al. (2018) GLEAM 3.0.

Examples

gleam_milk_production

GLEAM manure management system shares.

Description

Regional MMS allocation by species and system.

Usage

gleam_mms_shares

Format

A tibble with region, species, system, mms, share_percent.

Source

MacLeod et al. (2018) GLEAM 3.0.

Examples

gleam_mms_shares

Processing and transport emission factors for feeds.

Description

Emission factors for processing and transport of feed materials, for ruminant and monogastric species.

Usage

gleam_processing_transport_ef

Format

A tibble with columns:

material_number: Sequential material identifier.
material: Feed material code.
processing_g_co2eq_kg_dm: Processing emission factor in g CO2-eq per kg dry matter.
transport_g_co2eq_kg_dm: Transport emission factor in g CO2-eq per kg dry matter.
species_group: "ruminant" or "monogastric".

Source

MacLeod et al. (2018) GLEAM 3.0 Supplement S1, Tables S.6.5 and S.6.6.

Examples

gleam_processing_transport_ef

Accessibility and conversion parameters for grazable grass availability.

Description

Accessibility and conversion parameters for grazable grass availability.

Usage

grass_access_shares(aboveground = 0.46, grazable = 1, w_c_dm = 0.45)

Arguments

aboveground

Fraction of total grass NPP that is above ground (the grazable compartment; LPJmL pft_npp is whole-plant). Default 0.46.

grazable

Sustainable fraction of above-ground forage that can be grazed. Default 1 (the full above-ground ceiling); set below 1 to impose a sustainable-offtake share.

w_c_dm

Carbon-to-dry-matter mass fraction. Default 0.45.

Value

A named list with aboveground, grazable and w_c_dm.

Examples

grass_access_shares(grazable = 0.6)
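
How the three parameters might combine can be sketched as follows. This assumes that grazable dry matter is obtained by taking the above-ground, grazable share of whole-plant NPP expressed in carbon and dividing by the carbon fraction of dry matter; that formula, the stand-in list and the NPP value are assumptions for illustration and are not confirmed by the package documentation.

```r
# Stand-in for the list returned by grass_access_shares(grazable = 0.6).
shares <- list(aboveground = 0.46, grazable = 0.6, w_c_dm = 0.45)

# Hypothetical whole-plant grass NPP, in tonnes of carbon, for one grid cell.
npp_t_c <- 100

# Assumed combination: keep the above-ground, grazable share of the carbon,
# then convert carbon to dry matter by dividing by the carbon fraction.
grazable_t_dm <- npp_t_c * shares$aboveground * shares$grazable / shares$w_c_dm
grazable_t_dm
```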

Grazing energy coefficients.

Description

Walking energy cost for grazing animals (MJ/kg body weight/km).

Usage

grazing_energy_coefs

Format

A tibble with parameter, value_mj_kg_km, source.

Source

NRC 2001 (0.00045 Mcal/kg/km converted to MJ).

Examples

grazing_energy_coefs
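
The unit conversion mentioned in the Source can be reproduced with a short calculation, using 1 Mcal = 4.184 MJ. The animal weight and walking distance in the second step are hypothetical, and the stored value in the dataset may be rounded differently.

```r
mcal_to_mj <- 4.184           # 1 Mcal = 4.184 MJ
walking_cost_mcal <- 0.00045  # Mcal per kg body weight per km (NRC 2001)

# Walking cost in MJ per kg body weight per km (about 0.00188).
walking_cost_mj <- walking_cost_mcal * mcal_to_mj
walking_cost_mj

# Illustrative use: a hypothetical 600 kg animal walking 2 km per day.
daily_walking_mj <- walking_cost_mj * 600 * 2
daily_walking_mj
```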

Build agro-climatic, rainfed-gated fallow allocation weights.

Description

Compute a weight per (area, item) pair for distributing reported fallow across crops, based on gridded crop placement and a crop-by-agro-climatic-zone propensity. For each grid cell, rainfed crop area is multiplied by the crop's propensity in the cell's agro-climatic zone (derived from GAEZ length of growing period and thermal climate), and the products are summed by country and item. Dryland cereals and pulses score high in arid and semi-arid zones, and rainfed rice scores high in the humid tropics (rice-fallow systems); perennials and the irrigated share score approximately zero.

Usage

gridded_fallow_weights(gridded_crops, grid_aez = NULL, propensity = NULL)

Arguments

gridded_crops

Tibble keyed by grid cell and item_cbs_code with columns lon, lat, area_code, rainfed_ha.

grid_aez

Tibble of lon, lat, lgp (length of growing period in days), thermal (GAEZ thermal-climate class). If NULL, the packaged grid_aez.csv is used.

propensity

Tibble of item_cbs_code, zone, fallow_propensity. If NULL, the packaged fallow_propensity.csv is used.

Value

A tibble with area_code, item_cbs_code, weight.

Examples

gridded_crops <- tibble::tribble(
  ~lon, ~lat, ~area_code, ~item_cbs_code, ~rainfed_ha,
  0.25, 50.25, 1L, 2511L, 500
)
grid_aez <- tibble::tribble(~lon, ~lat, ~lgp, ~thermal, 0.25, 50.25, 100, 7L)
propensity <- tibble::tribble(
  ~item_cbs_code, ~zone, ~fallow_propensity,
  2511L, "semiarid", 0.8
)
gridded_fallow_weights(gridded_crops, grid_aez, propensity)
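
The weighting step described above (rainfed area times propensity, summed per area and item) can be sketched in base R. This is not the package implementation: the zone is taken as given rather than derived from lgp and thermal, and all codes, zones and numbers are hypothetical.

```r
# Hypothetical grid cells with a zone already assigned. The package derives
# the zone from lgp and thermal; that classification is not reproduced here.
cells <- data.frame(
  area_code = c(1L, 1L, 1L),
  item_cbs_code = c(2511L, 2511L, 2805L),
  zone = c("semiarid", "humid", "humid"),
  rainfed_ha = c(500, 200, 100)
)
zone_propensity <- data.frame(
  item_cbs_code = c(2511L, 2511L, 2805L),
  zone = c("semiarid", "humid", "humid"),
  fallow_propensity = c(0.8, 0.1, 0.5)
)

# Join propensity onto cells, weight the rainfed area, then sum per area and item.
joined <- merge(cells, zone_propensity, by = c("item_cbs_code", "zone"))
joined$weight <- joined$rainfed_ha * joined$fallow_propensity
weights <- aggregate(weight ~ area_code + item_cbs_code, data = joined, FUN = sum)
weights
```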

Harmonize advanced cases with interpolation for 1:N groups

Description

Harmonize data containing "simple" and "1:n" mappings. "simple" covers both 1:1 and N:1 relationships (values are summed). For "1:n" groups (one original item splits into several harmonized items) this function computes value shares across the full year range, interpolates missing shares, and applies them to split values.

Important for 1:n mappings: For each original item that splits into multiple harmonized items (e.g., "wheatrice" into "wheat" and "rice"), provide one row per target item_code_harm. Each row should have the same item, year, and value, differing only in item_code_harm. For example, to disaggregate "wheatrice":

Row 1: item = "wheatrice", item_code_harm = 1
Row 2: item = "wheatrice", item_code_harm = 2

Do not provide a single row; the function will not create duplicates automatically.

Usage

harmonize_interpolate(data, ...)

Arguments

data

A data frame containing at least columns:

item: String, original item name.
item_code_harm: Numeric, code for harmonized item.
year: Numeric, year of observation.
value: Numeric, value of observation.
type: String, "simple" or "1:n".

...

Additional grouping columns provided as bare names.

Value

A tibble with columns:

item_code: Numeric, code for harmonized item.
year: Numeric, year of observation.
value: Numeric, summed value of observation.
and any additional grouping columns.

Examples

# Simple-only data (no 1:n rows)
df_simple <- tibble::tribble(
  ~item, ~item_code_harm, ~year, ~value, ~type,
  "wheat", 1, 2000, 5, "simple",
  "barley", 2, 2000, 3, "simple",
  "oats", 2, 2000, 2, "simple"
)
harmonize_interpolate(df_simple)

# Mixed simple + 1:n data
df_mixed <- tibble::tribble(
  ~item, ~item_code_harm, ~year, ~value, ~type,
  "wheatrice", 1, 2000, 20, "1:n",
  "wheatrice", 2, 2000, 20, "1:n",
  "wheat", 1, 2000, 8, "simple",
  "rice", 2, 2000, 12, "simple"
)
harmonize_interpolate(df_mixed)

# Multiple years with share interpolation
# Shares are known in 2000 and 2002; 2001 is interpolated.
df_years <- tibble::tribble(
  ~item, ~item_code_harm, ~year, ~value, ~type,
  "wheat", 1, 2000, 6, "simple",
  "rice", 2, 2000, 4, "simple",
  "wheatrice", 1, 2001, 10, "1:n",
  "wheatrice", 2, 2001, 10, "1:n",
  "wheat", 1, 2002, 8, "simple",
  "rice", 2, 2002, 2, "simple"
)
harmonize_interpolate(df_years)

# With extra grouping columns
df_grouped <- tibble::tribble(
  ~item, ~item_code_harm, ~year, ~value, ~type, ~country,
  "wheat", 1, 2000, 6, "simple", "usa",
  "rice", 2, 2000, 4, "simple", "usa",
  "wheatrice", 1, 2001, 10, "1:n", "usa",
  "wheatrice", 2, 2001, 10, "1:n", "usa",
  "wheat", 1, 2002, 8, "simple", "usa",
  "rice", 2, 2002, 2, "simple", "usa",
  "wheat", 1, 2002, 8, "simple", "germany"
)
harmonize_interpolate(df_grouped, country)
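
The share logic for "1:n" groups can be illustrated by hand in base R. The sketch assumes linear interpolation of shares between known years, which the description does not spell out, and it reproduces only the splitting step for the df_years data, not the full output of harmonize_interpolate().

```r
# Known values from the "simple" rows of df_years.
known <- data.frame(
  item_code_harm = c(1, 2, 1, 2),
  year = c(2000, 2000, 2002, 2002),
  value = c(6, 4, 8, 2)
)

# Share of each harmonized item within its year.
known$share <- known$value / ave(known$value, known$year, FUN = sum)

# Linearly interpolate each item's share for the missing year 2001.
share_2001 <- sapply(split(known, known$item_code_harm), function(d) {
  approx(d$year, d$share, xout = 2001)$y
})

# Split the 1:n value (10 for "wheatrice" in 2001) by the interpolated shares.
split_2001 <- 10 * share_2001
split_2001
```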

Harmonize rows labeled "simple" by summing values

Description

Sum value for rows where type == "simple". This covers both 1:1 and N:1 item mappings, since in both cases the values are simply summed. The results are grouped by item_code_harm, year and any additional grouping columns supplied via ....

Usage

harmonize_simple(data, ...)

Arguments

data

A data frame containing at least columns:

item_code_harm: Numeric, code for harmonized item.
year: Numeric, year of observation.
value: Numeric, value of observation.
type: String, harmonization type. Only "simple" rows are used.

...

Additional grouping columns supplied as bare names.

Value

A tibble with columns:

item_code_harm: Numeric, code for harmonized item.
year: Numeric, year of observation.
value: Numeric, summed value of observation.
and any additional grouping columns.

Examples

# 1:1 mapping: one original item -> one harmonized code
df_one_to_one <- tibble::tribble(
  ~item_code_harm, ~year, ~value, ~type,
  1, 2000, 10, "simple",
  2, 2000, 3, "simple",
  1, 2001, 12, "simple",
  2, 2001, 5, "simple"
)
harmonize_simple(df_one_to_one)

# N:1 mapping: multiple items map to the same code
df_many_to_one <- tibble::tribble(
  ~item_code_harm, ~year, ~value, ~type,
  1, 2000, 4, "simple",
  1, 2000, 6, "simple",
  2, 2000, 3, "simple"
)
harmonize_simple(df_many_to_one)

# With an extra grouping column (e.g. country)
df_grouped <- tibble::tribble(
  ~item_code_harm, ~year, ~value, ~type, ~country,
  1, 2000, 4, "simple", "usa",
  1, 2000, 6, "simple", "usa",
  1, 2000, 9, "simple", "germany",
  2, 2000, 3, "simple", "usa"
)
harmonize_simple(df_grouped, country)

# Rows with type != "simple" are ignored
df_mixed <- tibble::tribble(
  ~item_code_harm, ~year, ~value, ~type,
  1, 2000, 10, "simple",
  1, 2000, 99, "1:n",
  2, 2000, 3, "simple"
)
harmonize_simple(df_mixed)
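
For readers who want to see the underlying operation, the following base-R sketch filters the "simple" rows and sums value by harmonized code and year. It is an illustrative equivalent under that reading of the description, not the package's own implementation.

```r
df <- data.frame(
  item_code_harm = c(1, 1, 2, 1),
  year = c(2000, 2000, 2000, 2000),
  value = c(4, 6, 3, 99),
  type = c("simple", "simple", "simple", "1:n")
)

# Keep only "simple" rows, then sum value per harmonized code and year.
simple_rows <- df[df$type == "simple", ]
summed <- aggregate(value ~ item_code_harm + year, data = simple_rows, FUN = sum)
summed
```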

Indirect N2O emission factors.

Description

Parameters for indirect N2O emissions from manure management: EF4 (volatilization), EF5 (leaching), FracGasMS, FracLeach.

Usage

indirect_n2o_ef

Format

A tibble with parameter, value, description.

Source

IPCC 2019, Vol 4, Ch 10, Table 10.22; Vol 4, Ch 11, Table 11.3.

Examples

indirect_n2o_ef
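
A simplified sketch of how the volatilization parameters are typically combined: the nitrogen lost as gas is the managed nitrogen times FracGasMS, and indirect N2O follows from EF4 and the 44/28 conversion from N2O-N to N2O. All numbers are hypothetical placeholders rather than values from the dataset.

```r
n_managed_kg <- 1000  # hypothetical N in managed manure (kg N)
frac_gas_ms <- 0.25   # hypothetical fraction volatilized (FracGasMS)
ef4 <- 0.01           # hypothetical EF4 (kg N2O-N per kg N volatilized)

# Nitrogen volatilized from manure management.
n_volatilized_kg <- n_managed_kg * frac_gas_ms

# Indirect N2O from volatilization, converted from N2O-N to N2O mass.
n2o_indirect_kg <- n_volatilized_kg * ef4 * 44 / 28
n2o_indirect_kg
```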

IPCC 2006 Tier 1 enteric emission factors.

Description

Table 10.11 (2006): Tier 1 regional EFs for enteric fermentation.

Usage

ipcc_2006_enteric_ef

Format

A tibble with region, category, ef_kg_head_yr.

Source

IPCC 2006, Vol 4, Ch 10, Table 10.11.

Examples

ipcc_2006_enteric_ef

IPCC 2006 Tier 1 manure emission factors.

Description

Table 10.14 (2006): Tier 1 regional EFs for manure CH4.

Usage

ipcc_2006_manure_ef

Format

A tibble with region, category, ef_kg_head_yr, temp_zone.

Source

IPCC 2006, Vol 4, Ch 10, Table 10.14.

Examples

ipcc_2006_manure_ef

IPCC 2006 MCF by temperature.

Description

Table 10.17 (2006): MCF values by MMS type and annual temperature.

Usage

ipcc_2006_mcf_temp

Format

A tibble with system, temp_c, mcf_percent.

Source

IPCC 2006, Vol 4, Ch 10, Table 10.17.

Examples

ipcc_2006_mcf_temp

IPCC 2019 Bo values (Table 10.16).

Description

Maximum CH4 producing capacity of manure (m3 CH4/kg VS). Dairy cattle (0.24) differs from other cattle (0.18).

Usage

ipcc_2019_bo

Format

A tibble with category, bo_m3_kg_vs.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.16.

Examples

ipcc_2019_bo
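
To show where Bo enters the calculation, the sketch below uses the general form of the IPCC manure CH4 emission factor (Eq. 10.23) for a single manure management system: EF = VS x 365 x Bo x 0.67 x MCF / 100, where 0.67 kg/m3 is the density of methane. The VS and MCF values are hypothetical; only the Bo of 0.24 comes from the description above.

```r
vs_kg_day <- 5       # hypothetical volatile solids excreted (kg VS/head/day)
bo_m3_kg_vs <- 0.24  # Bo for dairy cattle, as stated above
mcf_percent <- 10    # hypothetical methane conversion factor (%)
ch4_density <- 0.67  # kg CH4 per m3

# Annual manure CH4 emission factor (kg CH4/head/yr), single system.
ef_manure_ch4 <- vs_kg_day * 365 * bo_m3_kg_vs * ch4_density * mcf_percent / 100
ef_manure_ch4
```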

IPCC 2019 Cfi values (Table 10.4).

Description

Net energy maintenance coefficients (MJ/day/kg^0.75). Dairy (lactating) cattle use 0.386; non-dairy 0.322.

Usage

ipcc_2019_cfi

Format

A tibble with category, subcategory, cfi_mj_day_kg075.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.4.

Examples

ipcc_2019_cfi
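
The coefficient is used in the IPCC net energy for maintenance equation (Eq. 10.3), NEm = Cfi x weight^0.75. The sketch below applies it with the two Cfi values quoted above; the helper name and the 600 kg live weight are hypothetical.

```r
# Hypothetical helper: net energy for maintenance (MJ/day).
net_energy_maintenance <- function(weight_kg, cfi) {
  cfi * weight_kg^0.75
}

# Same hypothetical live weight, dairy (lactating) versus non-dairy Cfi.
nem_dairy <- net_energy_maintenance(600, cfi = 0.386)
nem_non_dairy <- net_energy_maintenance(600, cfi = 0.322)
c(dairy = nem_dairy, non_dairy = nem_non_dairy)
```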

IPCC 2019 enteric EF for cattle.

Description

Table 10.10: Tier 1 enteric fermentation emission factors for cattle by region (kg CH4/head/yr).

Usage

ipcc_2019_enteric_ef_cattle

Format

A tibble with region, category, ef_kg_head_yr, source.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.10.

Examples

ipcc_2019_enteric_ef_cattle

IPCC 2019 enteric EF for non-cattle.

Description

Table 10.11: Tier 1 enteric fermentation emission factors for non-cattle species (kg CH4/head/yr).

Usage

ipcc_2019_enteric_ef_other

Format

A tibble with category, ef_kg_head_yr, source.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.11.

Examples

ipcc_2019_enteric_ef_other

IPCC 2019 manure CH4 EF for cattle.

Description

Table 10.14: Tier 1 manure management CH4 emission factors for cattle by region (kg CH4/head/yr).

Usage

ipcc_2019_manure_ch4_ef_cattle

Format

A tibble with region, category, ef_kg_head_yr.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.14.

Examples

ipcc_2019_manure_ch4_ef_cattle

IPCC 2019 manure CH4 EF for non-cattle.

Description

Table 10.14: Tier 1 manure management CH4 emission factors for non-cattle species (kg CH4/head/yr).

Usage

ipcc_2019_manure_ch4_ef_other

Format

A tibble with category, ef_kg_head_yr.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.14.

Examples

ipcc_2019_manure_ch4_ef_other

IPCC 2019 MCF for manure management.

Description

Table 10.17: Methane Conversion Factors by manure management system and annual average temperature.

Usage

ipcc_2019_mcf_manure

Format

A tibble with system, annual_temp_c, mcf_percent.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.17.

Examples

ipcc_2019_mcf_manure

IPCC 2019 nitrogen excretion rates.

Description

Table 10.19: Daily N excretion rates by species and region (kg N/1000 kg animal mass/day).

Usage

ipcc_2019_n_excretion

Format

A tibble with region, category, nex_kg_per_1000kg_day.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.19.

Examples

ipcc_2019_n_excretion
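
Because the rates are expressed per 1000 kg of animal mass per day, an annual excretion per head needs the typical animal mass. A minimal sketch with hypothetical numbers (neither the rate nor the mass comes from the dataset):

```r
nex_rate <- 0.5       # hypothetical rate (kg N per 1000 kg animal mass per day)
animal_mass_kg <- 600 # hypothetical typical animal mass (kg)

# Annual N excretion per head (kg N/head/yr).
nex_kg_head_yr <- nex_rate * animal_mass_kg / 1000 * 365
nex_kg_head_yr
```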

IPCC 2019 direct N2O emission factors.

Description

Table 10.21: EF3 values (kg N2O-N/kg N) by manure management system.

Usage

ipcc_2019_n2o_ef_direct

Format

A tibble with mms_type, ef3_kg_n2on_kg_n, source.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.21.

Examples

ipcc_2019_n2o_ef_direct

IPCC 2019 Ym values (Table 10.12).

Description

Methane conversion rate (% GE) by species and feed situation. The 2019 Refinement differentiates:

Cattle feedlot (>90% concentrate): 3.0%.
Sheep >= 75 kg body weight: 6.7%.
Sheep < 75 kg body weight: 4.7%.

Usage

ipcc_2019_ym

Format

A tibble with category, feed_situation, ym_percent.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.12.

Examples

ipcc_2019_ym
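
Ym enters the IPCC Tier 2 enteric emission factor (Eq. 10.21) as EF = GE x (Ym / 100) x 365 / 55.65, where 55.65 MJ/kg is the energy content of methane. The sketch uses the feedlot Ym of 3.0% quoted above together with a hypothetical gross energy intake.

```r
ge_mj_day <- 200       # hypothetical gross energy intake (MJ/head/day)
ym_percent <- 3.0      # feedlot cattle Ym, as stated above
ch4_energy_mj <- 55.65 # energy content of methane (MJ/kg CH4)

# Annual enteric CH4 emission factor (kg CH4/head/yr).
ef_enteric <- ge_mj_day * (ym_percent / 100) * 365 / ch4_energy_mj
ef_enteric
```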

Tier 2 Bo values.

Description

Maximum CH4 producing capacity by detailed category. Dairy cattle 0.24 vs other cattle 0.18.

Usage

ipcc_tier2_bo_values

Format

A tibble with category, bo_m3_kg_vs.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.16.

Examples

ipcc_tier2_bo_values

Tier 2 energy coefficients.

Description

Coefficients for the IPCC Tier 2 gross energy (GE) calculation, including Cfi (maintenance), Ca (activity), Cp (pregnancy), Cw (work), and the energy content of weight gain. The subcategory column differentiates dairy (lactating) from non-dairy cattle.

Usage

ipcc_tier2_energy_coefs

Format

A tibble with columns:

category: Species (Cattle, Buffalo, Sheep, etc.).
subcategory: Dairy, Non-Dairy, or All.
cfi_mj_day_kg075: NEm coefficient (MJ/day/kg^0.75).
ca_pasture: Activity coefficient for grazing.
ca_feedlot: Activity coefficient for confined.
cp: Pregnancy coefficient.
cw: Work coefficient.
energy_content_gain_mj_kg: Energy per kg gain.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Eq 10.3-10.16; Tables 10.4-10.5.

Examples

ipcc_tier2_energy_coefs

Tier 2 manure ash content.

Description

Ash content of manure as percent of dry matter, used in VS calculation (Eq 10.24).

Usage

ipcc_tier2_manure_ash

Format

A tibble with category, ash_percent.

Source

IPCC 2019 Refinement, Vol 4, Ch 10.

Examples

ipcc_tier2_manure_ash
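
The role of ash content in the volatile solids calculation (Eq 10.24) can be sketched as VS = [GE x (1 - DE / 100) + UE x GE] x [(1 - ASH) / 18.45], with ASH as a fraction. The helper name and all input values are hypothetical; the ash percentage is converted to a fraction because the dataset stores it as a percent.

```r
# Hypothetical helper: volatile solids excretion (kg VS/head/day).
volatile_solids <- function(ge_mj_day, de_percent, ash_percent, ue = 0.04) {
  ash_frac <- ash_percent / 100
  (ge_mj_day * (1 - de_percent / 100) + ue * ge_mj_day) *
    ((1 - ash_frac) / 18.45)
}

# Illustrative inputs only (not taken from ipcc_tier2_manure_ash).
vs <- volatile_solids(ge_mj_day = 200, de_percent = 65, ash_percent = 8)
vs
```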

Tier 2 nitrogen retention fractions.

Description

Fraction of N intake retained in animal products. Dairy cattle 0.20 vs other cattle 0.07.

Usage

ipcc_tier2_n_retention

Format

A tibble with category, n_retention_frac.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.20.

Examples

ipcc_tier2_n_retention
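
In a Tier 2 calculation the retention fraction determines how much of the nitrogen intake is excreted: Nex = N intake x (1 - retention fraction). The sketch uses the dairy cattle fraction of 0.20 quoted above and a hypothetical daily N intake.

```r
n_intake_kg_day <- 0.4   # hypothetical N intake (kg N/head/day)
n_retention_frac <- 0.20 # dairy cattle, as stated above

# Annual N excretion per head (kg N/head/yr).
nex_tier2_kg_yr <- n_intake_kg_day * (1 - n_retention_frac) * 365
nex_tier2_kg_yr
```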

Tier 2 Ym values.

Description

Methane conversion rate by species and feed situation for Tier 2 enteric CH4. Includes feedlot distinction and sheep body weight differentiation.

Usage

ipcc_tier2_ym_values

Format

A tibble with category, feed_situation, ym_percent.

Source

IPCC 2019 Refinement, Vol 4, Ch 10, Table 10.12.

Examples

ipcc_tier2_ym_values

Commodity balance sheet items

Description

Defines name/code correspondences for commodity balance sheet (CBS) items.

Usage

items_cbs

Format

A tibble where each row corresponds to one CBS item. It contains the following columns:

item_cbs_code: A numeric code used to refer to the CBS item.
item_cbs_name: A natural language name for the item.
item_type: An ad-hoc grouping of items. This is a work in progress that evolves with our needs, so for now it has only two possible values:
- livestock: The CBS item represents a live animal.
- other: None of the previous groups.

Source

Inspired by FAOSTAT data.

Full CBS item table

Description

Extended item reference table covering all CBS items, including their process and commodity codes, feed type classifications, and default material flow destinations.

Usage

items_full

Format

A tibble where each row corresponds to one CBS item. It contains the following columns:

item_cbs: Name of the CBS item.
item_cbs_code: Numeric CBS item code.
comm_code: Commodity code used in process-based modelling (may contain "#N/A" when not applicable).
proc_code: Process code (may contain "#N/A" when not applicable).
proc: Process name (may contain "#N/A" when not applicable).
unit: Measurement unit (typically "tonnes").
group: Broad item group. Common values include "Additives", "Crop products", "Crop residues", "Draught", "Fish", "Forestry", "Grass", "Livestock", and others.
feedtype_graniv: Feed type classification for granivores (e.g., "additives", "concentrates", "roughages").
feedtype_grazers: Feed type classification for grazers.
comm_group: Sub-group of the commodity (e.g., "Additives", "Alcohol", "Ethanol", "Oil cakes", "Other processing residues").
Cat_1: Primary category label used in material flow accounting.
Name_biomass: Corresponding item name in biomass_coefs, enabling joins with the biomass coefficient table.
dbMFA_items: Item identifier used in the material flow analysis database.
FEDNA: Item name used in FEDNA feed composition tables.
default_destiny: Default CBS use category for this item. One of "Feed", "Food", "Other_uses", "Processing", or NA.

Source

Derived from FAOSTAT data and internal commodity classification work.

Examples

head(items_full)
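
Since some code columns may hold the literal string "#N/A", it can be convenient to turn those into real missing values before filtering or joining. The sketch below does this on a made-up two-row table; the item names and codes are hypothetical and stand in for items_full.

```r
# Hypothetical stand-in for a few columns of items_full.
toy_items <- data.frame(
  item_cbs = c("Item A", "Item B"),
  comm_code = c("c001", "#N/A"),
  stringsAsFactors = FALSE
)

# Replace the "#N/A" placeholder with a proper NA.
toy_items$comm_code[toy_items$comm_code == "#N/A"] <- NA
toy_items
```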

Primary production items linked to CBS

Description

Maps FAOSTAT primary production items and crop products to their CBS counterparts, along with farm and labour classifications.

Usage

items_prim

Format

A tibble where each row corresponds to one production item. It contains the following columns:

item_prod: Name of the production item (e.g., "Wheat", "Rice").
item_prod_code: FAOSTAT production item code (character).
item_cbs: Name of the corresponding CBS item.
item_cbs_code: Numeric CBS item code.
Farm_class: Farm system classification. Crop items use codes such as "COP" (cereals, oilseeds, protein crops), "Vegetables", "Fruits", "Olive", "Grapevine", "Other_crops". Livestock items use "Dairy_cows", "Cattle", "Monogastric", "Sheep_goats", "Bees", "Game". NA for non-farm items.
Cat_Labour: Labour category used in agricultural labour analyses.
Cat_FAO1: Top-level FAO commodity category.
group: Item group classification. One of "Primary crops", "Crop products", "Livestock products", "Grass", "Crop residues", "Scavenging", "Livestock".

Source

Derived from FAOSTAT Production data.

Examples

head(items_prim)

Primary production items

Description

Defines name/code correspondences for production items.

Usage

items_prod

Format

A tibble where each row corresponds to one production item. It contains the following columns:

item_prod_code: A numeric code used to refer to the item.
item_prod_name: A natural language name for the item.
item_type: An ad-hoc grouping of items. This is a work in progress that evolves with our needs, so for now it has only two possible values:
- crop_product: The item represents a crop product.
- other: None of the previous groups.

Source

Inspired by FAOSTAT data.

Full production item table

Description

Comprehensive reference table for all production items, combining CBS linkages, biomass names, multiple classification schemes, and crop ecological traits.

Usage

items_prod_full

Format

A tibble where each row corresponds to one production item. It contains the following columns:

item_prod: Name of the production item.
item_prod_code: FAOSTAT production item code (character).
item_cbs: Name of the corresponding CBS item.
item_cbs_code: Numeric CBS item code.
group: Item group. One of "Primary crops", "Crop products", "Livestock products", "Crop residues", "Grassland", "Scavenging".
live_anim: Name of the parent live animal for livestock-derived items (NA for crop items).
live_anim_code: Numeric CBS code of the parent live animal (NA for crop items).
Cat_Krausmann: Item category used in Krausmann et al. biomass flow accounting.
Name_biomass: Corresponding item name in biomass_coefs, enabling joins for biomass coefficients.
Name_Eurostat: Corresponding item name in Eurostat agricultural statistics.
Name: Alternative or display name for the item.
Cat_Labour: Labour category used in agricultural labour analyses.
Cat_FAO1: Top-level FAO commodity category (e.g., "Cereals", "Oilcrops").
Cat_Origin: Origin-based commodity category. One of "Cereals", "Vegetables and Fruits", "Sugar and Stimulants", "Oil Crops", "Fodder crops", "Fibres and Crude Materials", or NA.
Cat_Use: Use-based commodity category (e.g., "Grains", "Oils and Fats", "Fodder crops", "Beverages, sugar and stimulants").
Order: Numeric ordering field used for sorting items consistently across outputs.
Categ: General category label used in some analyses.
Farm_class: Farm system classification (see items_prim for values).
c3_c4: Photosynthetic pathway. One of "c3", "c4", or NA.
ann_per_nfx: Annual/perennial and nitrogen fixation trait. One of "ann" (annual), "per" (perennial), "nfx" (nitrogen-fixing), or NA.
Cat_1: Primary category label for material flow accounting.
Cat_2: Secondary category label.
Cat_3: Tertiary category label.
Cat_4: Quaternary category label.
Herb_Woody: Plant growth form. One of "Herbaceous", "Woody", or NA.
Crop_irrig: Irrigation category used in water-use analyses.
Cat_Org: Organic farming category classification.
Cat_Ymax: Maximum attainable yield category.
Cat_Ymax_leg: Legend label for the Cat_Ymax category.

Source

Derived from FAOSTAT Production data and multiple classification schemes from the literature.

Examples

head(items_prod_full)

Grassland share of synthetic nitrogen by country and year

Description

Country-level time series of the share of synthetic nitrogen applied to grassland (versus cropland). Used to split national N totals between land use types in the WHEP nitrogen pipeline.

Usage

lassaletta_grassland_share

Format

A tibble with one row per country-year combination containing:

Country: Country name.
year: Year (numeric).
grass_share: Share of synthetic N applied to grassland (0–1).

Source

Lassaletta et al. nitrogen flow dataset. See pipeline documentation for full citation.

Examples

head(lassaletta_grassland_share)
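
The split between grassland and cropland described above can be sketched in base R. This is a minimal illustration, not the pipeline's code: the two small data frames are stand-ins with made-up values, and the column names n_total, n_grassland and n_cropland are hypothetical.

```r
# Stand-in for lassaletta_grassland_share (illustrative values only).
grass_share_tbl <- data.frame(
  Country = c("Spain", "Spain"),
  year = c(2000, 2001),
  grass_share = c(0.10, 0.12)
)

# Hypothetical national synthetic N totals, in any consistent mass unit.
national_n <- data.frame(
  Country = c("Spain", "Spain"),
  year = c(2000, 2001),
  n_total = c(1000, 1100)
)

# Join on country and year, then split each total by the grassland share.
split_n <- merge(national_n, grass_share_tbl, by = c("Country", "year"))
split_n$n_grassland <- split_n$n_total * split_n$grass_share
split_n$n_cropland <- split_n$n_total - split_n$n_grassland
split_n
```

Because grass_share is a fraction between 0 and 1, the two parts always add back up to the national total.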

Livestock unit coefficients

Description

Provides livestock unit (LU) conversion factors per head for each animal class, used to express heterogeneous livestock populations in comparable units.

Usage

liv_lu_coefs

Format

A tibble where each row corresponds to one animal class. It contains the following columns:

Animal_class: Animal class identifier (e.g., "Dairy_cows", "Cattle", "Sheep_goats", "Broilers", "Hens", "Pigs").
LU_head: Livestock units per head (numeric). Dairy cows have a value of 1.0 by convention; smaller animals have proportionally lower values.

Source

Based on standard livestock unit definitions from FAO and European agricultural statistics.

Examples

head(liv_lu_coefs)
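
A sketch of how such coefficients are typically applied: head counts are multiplied by LU_head and summed. The coefficient values below are placeholders chosen for the illustration (only the 1.0 for dairy cows comes from the description above), and the herd table is hypothetical.

```r
# Stand-in for liv_lu_coefs; non-dairy values are illustrative placeholders.
lu_coefs <- data.frame(
  Animal_class = c("Dairy_cows", "Sheep_goats", "Pigs"),
  LU_head = c(1.0, 0.1, 0.3)
)

# Hypothetical herd, in heads per animal class.
herd <- data.frame(
  Animal_class = c("Dairy_cows", "Sheep_goats", "Pigs"),
  heads = c(100, 500, 200)
)

# Convert heads to livestock units and aggregate across classes.
herd_lu <- merge(herd, lu_coefs, by = "Animal_class")
herd_lu$livestock_units <- herd_lu$heads * herd_lu$LU_head
total_lu <- sum(herd_lu$livestock_units)
total_lu
```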

Livestock physical constants.

Description

Named list of physical constants used in livestock emission calculations:

energy_content_ch4_mj_kg: 55.65 MJ/kg CH4.
ch4_density_kg_m3: 0.67 kg/m3.
vs_energy_content_mj_kg: 18.45 MJ/kg DM.
n_to_n2o: 44/28 (N to N2O molecular mass ratio).
days_in_year: 365.
default_de_percent: 65%.
default_ue_fraction: 0.04 (urinary energy as fraction of GE).
ev_wool_mj_kg: 24.0 MJ/kg clean wool.

Usage

livestock_constants

Format

A named list.

Source

IPCC 2019 Refinement, Vol 4, Ch 10.

Examples

str(livestock_constants)
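
As a rough illustration of where two of these constants enter, the sketch below applies the IPCC Tier 2 enteric methane relation (gross energy times the methane conversion factor, divided by the energy content of methane) and the N to N2O mass conversion. The local list only mirrors the documented values; the gross energy intake, the Ym value and the N2O-N amount are hypothetical inputs, and whether the package combines the constants exactly this way is an assumption.

```r
# Local copy of three documented constants (not read from the package).
constants <- list(
  energy_content_ch4_mj_kg = 55.65,
  n_to_n2o = 44 / 28,
  days_in_year = 365
)

# Hypothetical animal: gross energy intake and methane conversion factor.
ge_mj_day <- 250
ym_percent <- 6.5

# Enteric CH4 (kg per head per year) from the IPCC Tier 2 relation.
ch4_kg_year <- ge_mj_day * (ym_percent / 100) * constants$days_in_year /
  constants$energy_content_ch4_mj_kg

# Convert a hypothetical 10 kg of N2O-N to kg of N2O.
n2o_kg <- 10 * constants$n_to_n2o

c(ch4_kg_year = ch4_kg_year, n2o_kg = n2o_kg)
```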

Default production parameters.

Description

Default values for milk fat, protein, and lactose content (percent), weight gain, work hours, and pregnancy fraction by species.

Usage

livestock_production_defaults

Format

A tibble with columns:

category: Species or animal class.
fat_percent: Milk fat content (percent).
protein_percent: Milk protein content (percent).
lactose_percent: Milk lactose content (percent).
weight_gain_kg_day: Average daily weight gain (kg/day).
work_hours_day: Hours of draft work per day.
pregnant_fraction: Fraction of females pregnant.

Source

NRC 2001; IPCC 2019, Vol 4, Ch 10.

Examples

livestock_production_defaults

Build an LPJmL/WHEP spatial covariate function

Description

Creates a function(centroids_sf, year) suitable for APIs that accept a spatial covariate density, such as build_constant_territory_series(). The covariate is built from prepared WHEP/LPJmL spatialization inputs, either from a local input_dir or from pinned WHEP spatialization inputs when input_dir = NULL.

Usage

make_lpjml_covariate(
  input_dir = NULL,
  years = NULL,
  weighting = c("total_cropland", "crop_pattern"),
  item_prod_code = NULL,
  cft_mapping = whep::cft_mapping
)

Arguments

input_dir

Directory holding prepared WHEP/LPJmL input parquets. If NULL, the function reads pinned spatialization inputs via whep_read_file().

years

Integer vector of years to retain. If NULL, all years present in the required input are used.

weighting

Either "total_cropland", "crop_pattern", or a custom function(centroids_sf, year). A custom function is returned unchanged.

item_prod_code

FAOSTAT/WHEP production item code. Required when weighting = "crop_pattern".

cft_mapping

Mapping from WHEP item codes to CFT/LUH2 types. Defaults to cft_mapping.

Value

A function(centroids_sf, year) returning non-negative density values aligned to centroids_sf.

Examples

# A custom covariate function is returned unchanged, ready to plug into
# build_constant_territory_series(). Here a uniform (area-weighting) density:
uniform <- function(centroids_sf, year) rep(1, nrow(centroids_sf))
covariate <- make_lpjml_covariate(weighting = uniform)
identical(covariate, uniform)

# The "total_cropland" and "crop_pattern" modes instead read prepared
# WHEP/LPJmL spatialization parquets from `input_dir` (or pinned inputs).

Maximum intake shares.

Description

Diet share caps, defined per livestock category, that limit how much of a diet a feed item or feed class may make up. Migrated from the afsetools Livestock_coefs.xlsx workbook.

Usage

max_intake_share

Format

A tibble with one row per cap:

livestock_category: Livestock category the cap applies to.
var: Cap key type: item_cbs (per item) or Cat_feed (per feed class).
var_value: Item or feed class the cap applies to.
max_intake_share: Maximum share of the diet (fraction).

Source

afsetools Livestock_coefs.xlsx.
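
A minimal sketch of checking a diet against per-item caps, under stated assumptions: the cap rows, the livestock category name, the feed item names and the diet table are all hypothetical, and var_value is assumed to hold item names for rows where var is item_cbs. How the package itself enforces the caps is not shown here.

```r
# Stand-in for max_intake_share with made-up rows.
caps <- data.frame(
  livestock_category = c("Pigs", "Pigs"),
  var = c("item_cbs", "Cat_feed"),
  var_value = c("Rape and Mustard Cake", "Residues"),
  max_intake_share = c(0.10, 0.30)
)

# Hypothetical diet composition as fractions of total intake.
diet <- data.frame(
  livestock_category = c("Pigs", "Pigs"),
  item_cbs = c("Maize", "Rape and Mustard Cake"),
  share = c(0.85, 0.15)
)

# Keep only per-item caps and match them to the diet rows they apply to.
item_caps <- caps[caps$var == "item_cbs", ]
checked <- merge(
  diet, item_caps,
  by.x = c("livestock_category", "item_cbs"),
  by.y = c("livestock_category", "var_value")
)

# Flag diet rows whose share is above the cap.
checked$exceeds_cap <- checked$share > checked$max_intake_share
checked[, c("livestock_category", "item_cbs", "share", "max_intake_share", "exceeds_cap")]
```

Items without a cap row drop out of the inner join, so only capped items are checked.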

Melt a bilateral trade matrix to long format.

Description

Convert the bilateral_trade list-column of get_bilateral_trade() (one square origin-by-destination matrix per year and item) into the tidy from_code/to_code long form consumed by compute_footprint_balance(). Self-trade (diagonal) entries are dropped.

Usage

melt_bilateral_trade(bilateral_trade)

Arguments

bilateral_trade

Tibble from get_bilateral_trade(), with year, item_cbs_code and a bilateral_trade matrix list-column.

Value

A tibble with year, from_code, to_code, item_cbs_code and value.

Examples

m <- matrix(
  c(0, 40, 0, 0),
  nrow = 2,
  dimnames = list(c("1", "2"), c("1", "2"))
)
bt <- tibble::tibble(
  year = 2010L, item_cbs_code = 10L, bilateral_trade = list(m)
)
melt_bilateral_trade(bt)
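
For intuition, the reshaping of a single matrix can be approximated in base R. This is a sketch of the idea rather than the function's implementation: it treats rows as origins and columns as destinations, keeps codes as character strings, and keeps zero-valued off-diagonal cells. Each of those choices is an assumption.

```r
# Same 2 x 2 origin-by-destination matrix as in the example above.
m <- matrix(
  c(0, 40, 0, 0),
  nrow = 2,
  dimnames = list(c("1", "2"), c("1", "2"))
)

# One row per matrix cell: row name, column name, cell value.
long <- as.data.frame(as.table(m), stringsAsFactors = FALSE)
names(long) <- c("from_code", "to_code", "value")

# Drop self-trade (diagonal) entries.
long <- long[long$from_code != long$to_code, ]
long
```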

Synthetic nitrogen application rates by crop and country

Description

Country- and crop-process-level synthetic nitrogen application rates (kg N ha$^{-1}$), derived from Mueller et al. (2012). Used as reference crop-specific N rates in the WHEP nitrogen pipeline.

Usage

mueller_synthetic_n

Format

A tibble with one row per crop-process-country combination containing:

proc_code: Internal process code (e.g. "p001").
crop_process: Descriptive crop process name (e.g. "Rice production").
crop_original: Crop name as in the source dataset.
unit: Unit of the rate value (always "kgN/ha").
iso3c: ISO 3166-1 alpha-3 country code.
rate_value: Nitrogen application rate (kg N ha$^{-1}$).

Source

Mueller, N. D. et al. (2012). Closing yield gaps through nutrient and water management. Nature, 490(7419), 254–257. doi:10.1038/nature11420

Examples

head(mueller_synthetic_n)

Interactive footprint Sankey viewer

Description

Create a browser-based Sankey viewer from a footprint table, such as the output of compute_footprint(). The viewer includes stage filters, per-stage node limit controls, a minimum-flow threshold, hover tooltips, click-to-highlight interactions, and SVG download.

Each row is treated as one path through the selected stages, with value_col giving the path size. Rows are summed before plotting, so repeated paths are shown as one flow.

The default stages are origin_area, origin_item, target_item, and target_area; target_fd is appended automatically when present. If columns such as origin_area_name or target_item_name are present, they are used as display labels.

Usage

plot_footprint_sankey(
  footprints,
  stages = NULL,
  value_col = "value",
  label_cols = NULL,
  max_nodes = 10,
  other_label = "Other",
  stage_max_nodes = NULL,
  stage_other_labels = NULL,
  embed_max_nodes = Inf,
  stage_embed_max_nodes = NULL,
  min_share = 0,
  title = "Footprint Sankey Viewer",
  subtitle = NULL,
  value_label = "footprint",
  width = "100%",
  height = 680,
  file = NULL,
  open = FALSE
)

Arguments

footprints

A data frame with footprint paths and a numeric value column.

stages

Character vector of columns to use as Sankey stages. Defaults to the footprint origin and target columns.

value_col

Name of the numeric column containing flow values.

label_cols

Optional named character vector mapping stage columns to display-label columns, for example c(origin_area = "origin_area_name"). When omitted, {stage}_name columns are used where available.

max_nodes

Maximum number of individual nodes to show per stage in the current filtered browser view. Less important nodes are grouped into other_label. Use Inf to show all nodes.

other_label

Label used for grouped nodes when max_nodes is finite.

stage_max_nodes

Optional named numeric vector overriding max_nodes for selected stages, for example c(product = 75, target_area = 12).

stage_other_labels

Optional named character vector overriding other_label for selected stages, for example c(product = "Other products").

embed_max_nodes

Maximum number of individual nodes to embed per stage before writing the viewer. Less important nodes are permanently grouped before serialization, reducing file size. Use Inf to keep all nodes available to the browser.

stage_embed_max_nodes

Optional named numeric vector overriding embed_max_nodes for selected stages.

min_share

Minimum visible path size as a percentage of the current filtered total. Set to 0 to keep diffuse trade links visible. Users can change this in the browser.

title

Optional viewer title.

subtitle

Optional viewer subtitle.

value_label

Label used in tooltips and the summary line.

width

Viewer width as a CSS value.

height

Sankey SVG height in pixels.

file

Optional path to write a standalone HTML viewer.

open

If TRUE and file is supplied, open the written viewer in the default browser. If TRUE and file is NULL, a temporary HTML file is written and opened.

Value

A browsable HTML object when file is NULL and open is FALSE; otherwise invisibly returns the HTML file path.

Examples

footprints <- tibble::tribble(
  ~origin_area, ~origin_item, ~target_item, ~target_area, ~value,
  "Brazil", "Soybeans", "Pigmeat", "China", 10,
  "Brazil", "Soybeans", "Milk", "China", 4,
  "Brazil", "Soybeans", "Soybean Oil", "France", 3
)

if (
  interactive() &&
    requireNamespace("htmltools", quietly = TRUE) &&
    requireNamespace("jsonlite", quietly = TRUE)
) {
  plot_footprint_sankey(footprints)
}
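
The pre-plot aggregation and the min_share threshold can be mimicked in base R to see which paths would remain visible. This is only a sketch: the two-stage table is hypothetical, and treating the threshold as "greater than or equal to" is an assumption about the viewer's behaviour.

```r
# Hypothetical footprint paths with a repeated origin-target pair.
paths <- data.frame(
  origin_area = c("Brazil", "Brazil", "Brazil", "Peru"),
  target_area = c("China", "China", "France", "China"),
  value = c(10, 4, 3, 0.1)
)

# Repeated paths are summed into a single flow.
flows <- aggregate(value ~ origin_area + target_area, data = paths, FUN = sum)

# Share of each flow in the total, as a percentage.
flows$share_pct <- 100 * flows$value / sum(flows$value)

# Keep flows at or above a 1 percent threshold.
min_share <- 1
visible <- flows[flows$share_pct >= min_share, ]
visible
```

With min_share = 0 every flow is kept, which matches the documented default of leaving diffuse trade links visible.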

Polities

Description

Periodized WHEP polity database imported from the whep-polities repository.

Usage

polities

Format

An sf data frame where each row corresponds to one territorial polity over a continuous time interval. Key columns include:

polity_code: Stable WHEP polity identifier, usually PREFIX-start_year-end_year.
polity_name: Human-readable polity name.
start_year, end_year: Inclusive validity years for the row.
iso3_code, iso3c: ISO3 code where one exists. iso3c is retained as a compatibility alias.
polygon_status: Polygon status in whep-polities ("assigned", "proxy", "missing", or "excluded").
has_geometry: Logical flag indicating whether the geometry is non-empty.
geom: Multipolygon geometry.

Source

⁠~/whep-polities/data/final/polities_database.gpkg⁠.

Polity categories and regional classifications

Description

Reference table for countries and political entities (polities) with identifiers from multiple data sources and assignments to various regional groupings used in the literature and international databases.

Usage

polities_cats

Format

A tibble where each row corresponds to one polity (country or territory). It contains the following columns:

polity_code: Legacy current polity prefix, usually ISO 3166-1 alpha-3 (e.g., "AFG", "ALB").
polity_name: Current polity, country, or territory name.
V1: Internal row index from the source table.
code: Numeric FAOSTAT country code.
polity_area_code: Numeric WHEP reporting area code used in matrix workflows.
reporting_polity_code: Current periodized WHEP polity code for code.
reporting_polity_name: Current WHEP polity name for code.
reporting_polity_has_geometry: Logical flag indicating whether the current reporting polity has a polygon.
iso3c: ISO 3166-1 alpha-3 code (character; may duplicate polity_code or differ for aggregates).
FAOSTAT_name: Country name as used in FAOSTAT.
EU27: Logical flag; TRUE if the polity is a member of the EU27.
name: Country name used in other external databases.
eia: Country name or code used by the US Energy Information Administration (EIA).
iea: Country identifier used by the International Energy Agency (IEA).
water_code: Numeric code used in water statistics datasets.
water_area: Country/area name used in water statistics.
baci: Numeric BACI trade database country code.
fish: Numeric code used in fisheries datasets.
region_code: Numeric regional grouping code.
cbs: Logical flag; TRUE if the polity is included in the CBS dataset.
fabio_code: Numeric country code used in the FABIO database.
ADB_Region: Asian Development Bank regional classification.
region: General world region (e.g., "South Asia", "Eastern Europe").
uISO3c: Numeric UN M49 country code.
Lassaletta: Country grouping used in Lassaletta et al. nitrogen flow studies.
region_krausmann: Regional grouping from Krausmann et al. biomass flow accounting.
region_HANPP: Regional grouping used in human appropriation of net primary production (HANPP) studies.
region_krausmann2: Alternative Krausmann regional grouping.
region_UN_sub: UN sub-regional classification (M49 sub-region).
region_UN: UN macro-regional classification (M49 region).
region_ILO1: ILO primary regional grouping.
region_ILO2: ILO secondary regional grouping.
region_ILO3: ILO tertiary regional grouping.
region_IEA: IEA regional grouping.
region_IPCC: IPCC regional grouping used in climate assessments.
region_labour: Labour-focused regional grouping.
region_labour_agg: Aggregated labour-focused regional grouping.
region_labour_mech: Labour mechanisation regional grouping.
region_test: Experimental/test regional grouping (may be incomplete).

Note

Five trailing columns containing only Excel #REF! errors in the source CSV are dropped at load time and are not part of this dataset.

Source

Compiled from FAOSTAT, UN M49, ILO, IEA, and other international statistical sources.

Examples

head(polities_cats)

FAOSTAT/FABIO area-to-polity crosswalk

Description

Year-aware bridge from numeric reporting area_code values used by FAOSTAT/FABIO-derived WHEP data to periodized WHEP polity_code values.

Usage

polity_area_crosswalk

Format

A tibble with one row per area-code/polity-period mapping. Key columns:

area_code: Numeric FAOSTAT/FABIO reporting area code.
area_name: Reporting area name.
area_iso3c: Reporting-area ISO3-like code where available.
polity_area_code: Numeric area code retained for WHEP matrix workflows.
polity_code, polity_name: Matched WHEP polity, or NA for statistical composites that are not real polities.
polity_start_year, polity_end_year: Validity interval for the matched polity.
mapping_status: "matched", "manual", "unmapped", or "not_a_reporting_area".
mapping_note: Explanation for manual or unmapped rows.

Source

Derived from polities and inst/extdata/harmonization/regions_full.csv.
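
A year-aware lookup with this kind of table can be sketched as a join on area_code followed by a filter on the validity interval. The area code, polity codes and observation values below are invented for the illustration; only the column names follow the format described above.

```r
# Hypothetical crosswalk: one area code mapped to two successive polities.
crosswalk <- data.frame(
  area_code = c(999, 999),
  polity_code = c("AAA-1900-1950", "BBB-1951-2000"),
  polity_start_year = c(1900, 1951),
  polity_end_year = c(1950, 2000)
)

# Hypothetical observations reported under the same area code.
obs <- data.frame(
  area_code = c(999, 999),
  year = c(1940, 1960),
  value = c(1, 2)
)

# Join every candidate polity, then keep the one valid in each year.
joined <- merge(obs, crosswalk, by = "area_code")
in_period <- joined$year >= joined$polity_start_year &
  joined$year <= joined$polity_end_year
matched <- joined[in_period, c("area_code", "year", "value", "polity_code")]
matched
```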

Prepare production data for livestock emission calculations.

Description

Bridge between build_primary_production() output and the livestock emission functions. Maps species codes to names, converts area codes to ISO3, extracts milk and meat yields, and optionally expands herds into GLEAM cohorts.

Any extra columns present in the input (e.g., weight, diet_quality, fat_percent) are preserved and flow through to the emission functions automatically.

Usage

prepare_livestock_emissions(data, expand_cohorts = FALSE, system_shares = NULL)

Arguments

data

A tibble from build_primary_production() with columns item_cbs_code, unit, value, and optionally year, area_code, live_anim_code.

expand_cohorts

Logical. If TRUE, distributes herds across GLEAM cohorts and production systems via calculate_cohorts_systems(). Default FALSE.

system_shares

Optional data frame with custom system shares. Passed to calculate_cohorts_systems().

Value

A tibble with columns species, heads, iso3 (if area_code present), and optionally milk_yield_kg_day, meat_yield_t_head, cohort columns, plus all extra columns from the input.

Examples

tibble::tibble(
  item_cbs_code = 961,
  unit = "heads",
  value = 5000,
  area_code = 79L
) |>
  prepare_livestock_emissions()

Items with double-counting in production statistics

Description

Identifies production items that appear both as primary crop products and as harvested-area items, requiring special treatment to avoid double-counting in production and biomass accounting.

Usage

primary_double

Format

A tibble where each row corresponds to one item pair with a double-counting relationship. It contains the following columns:

Item_area: Name of the item as it appears in harvested-area statistics (e.g., "Seed cotton, unginned").
item_prod: Name of the derived production item (e.g., "Cotton lint, ginned", "Cotton seed").
item_prod_code: Numeric FAOSTAT production code of the derived item.
Multi_type: Classification of the double-counting type:
- "Primary": The area item is the primary crop; product is a direct output.
- "Primary_area": Area is recorded under a primary aggregate crop name.
- "Multi": Multiple products share the same harvested area.
- "Multi_area": Multiple products share a recorded area aggregate.

Source

Derived from FAOSTAT production methodology documentation.

Examples

head(primary_double)

Propagate input uncertainty through a footprint.

Description

Monte Carlo propagation of extension uncertainty: perturb the extension vector with multiplicative lognormal noise (so values stay non-negative and the expected factor is one), re-run the footprint for each draw, and summarise the spread of each output cell. A point estimate with no interval is not a trustworthy result; this turns one into a distribution.

Usage

propagate_fp_uncertainty(run_fn, extensions, cov = 0.1, options = list())

Arguments

run_fn

Function taking a perturbed extension vector and returning a footprint tibble with the grouping columns named in options$by plus a value column. Wrap compute_footprint() with the other arguments fixed.

extensions

Numeric vector of base extensions per sector.

cov

Coefficient of variation of the extensions: one number, or one per sector. Zero means no uncertainty.

options

Named list overriding n (draws, default 200), probs (lower/median/upper quantiles), by (grouping columns) and seed (for reproducible draws).

Value

A tibble with the by columns plus mean, sd, cv, q_low, q_med and q_high per output cell.

Examples

run_fn <- function(ext) {
  tibble::tibble(
    target_area = 1L, target_item = 10L, value = sum(ext)
  )
}
propagate_fp_uncertainty(
  run_fn,
  extensions = c(60, 40),
  cov = 0.1,
  options = list(n = 100, seed = 1, by = c("target_area", "target_item"))
)
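
The perturbation step can be sketched in base R. One standard way to draw multiplicative lognormal factors with expected value one and a given coefficient of variation is to set sdlog = sqrt(log(1 + cov^2)) and meanlog = -sdlog^2 / 2. Whether the package uses exactly this parameterisation is an assumption, as are the 2.5% and 97.5% quantiles chosen here.

```r
set.seed(1)

extensions <- c(60, 40)  # base extensions per sector
cov <- 0.1               # coefficient of variation
n_draws <- 1000

# Lognormal parameters giving mean 1 and the requested CV.
sdlog <- sqrt(log(1 + cov^2))
meanlog <- -sdlog^2 / 2

# One footprint total per draw: perturb each sector, then re-run the "model".
totals <- replicate(n_draws, {
  factors <- rlnorm(length(extensions), meanlog = meanlog, sdlog = sdlog)
  sum(extensions * factors)
})

# Summarise the spread of the output cell.
c(
  mean = mean(totals),
  sd = sd(totals),
  quantile(totals, probs = c(0.025, 0.5, 0.975))
)
```

Because the factors are strictly positive, perturbed extensions never change sign, and the mean of the draws stays close to the unperturbed total of 100.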

Read natural-grass productivity from an LPJmL run.

Description

Sums the natural-grass PFT net primary production bands (the ungrazed natural stand, climate-driven) into a per-cell productivity layer used to distribute grazing livestock by grass production rather than pasture area. The pinned WHEP artifact is used by default; pass run_dir to read a local finished LPJmL run, or pass productivity / productivity_path to use an already-derived custom artifact. Natural-grass NPP is exogenous to the grazing density, so it avoids the livestock to grassland_lsuha to grass-NPP circularity.

Usage

read_lpjml_grass_productivity(
  run_dir = NULL,
  years = NULL,
  first_year = 1901L,
  example = FALSE,
  productivity = NULL,
  productivity_path = NULL
)

Arguments

run_dir

Path to the LPJmL run output directory holding pft_npp.nc. If unset, the pinned lpjml-grass-productivity artifact is used.

years

Integer vector of calendar years to read.

first_year

First calendar year of the run's output time axis.

example

If TRUE, return a small fixture instead of reading a run.

productivity

Optional already-derived natural-grass productivity tibble/data frame. Takes precedence over pinned data and run_dir.

productivity_path

Optional path to an already-derived natural-grass productivity artifact (.parquet, .csv, or .rds). Takes precedence over pinned data and run_dir.

Value

A tibble with lon, lat, year and grass_npp (gC/m2/yr).

Examples

read_lpjml_grass_productivity(example = TRUE)

Record provenance for a reproducible result.

Description

Capture the information needed to regenerate a result: the package version (the code), the R version, the pinned versions of the input datasets used, and a timestamp. Attach the record to an output with attach_provenance() so any number can be traced back to the exact inputs and code that produced it.

Unknown aliases are an error rather than a silent omission, so a provenance record never quietly drops an input it could not resolve.

Usage

record_provenance(
  aliases = NULL,
  inputs = whep::whep_inputs,
  recorded_at = Sys.time()
)

Arguments

aliases

Optional character vector of input aliases to record. When NULL (default), every registered input is recorded.

inputs

Tibble of registered inputs with alias and version columns. Defaults to whep_inputs.

recorded_at

Timestamp for the record. Defaults to the current time; pass a fixed value for reproducible output.

Value

A tibble with one row per recorded input:

recorded_at: When the record was made.
whep_version: Installed package version (the code).
r_version: R version.
input_alias: Input dataset alias.
input_version: Pinned version of that input.

Examples

prov <- record_provenance(
  aliases = "bilateral_trade",
  recorded_at = as.POSIXct("2026-01-01", tz = "UTC")
)
prov
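
The fail-loudly rule for unknown aliases can be illustrated with a small stand-alone helper. Everything here is hypothetical: check_aliases() is not a package function, the inputs table is a stand-in for whep_inputs with invented versions, and the error message is made up.

```r
# Stand-in registry of inputs with invented versions.
inputs <- data.frame(
  alias = c("bilateral_trade", "primary_prod"),
  version = c("v1", "v2")
)

# Hypothetical helper: error on any alias that is not registered.
check_aliases <- function(aliases, inputs) {
  unknown <- setdiff(aliases, inputs$alias)
  if (length(unknown) > 0) {
    stop("Unknown input aliases: ", paste(unknown, collapse = ", "), call. = FALSE)
  }
  inputs[inputs$alias %in% aliases, ]
}

# A registered alias resolves to its pinned version.
ok <- check_aliases("bilateral_trade", inputs)

# An unregistered alias raises an error instead of being dropped.
failed <- inherits(
  try(check_aliases("not_registered", inputs), silent = TRUE),
  "try-error"
)
```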

Redistribute available feed supply among livestock demand.

Description

Matches livestock feed demand to available feed items through a hierarchical allocation that follows the remaining-share principle to avoid exceeding availability. The redistribution path adapts to the fixed_demand column in the demand table.

Usage

redistribute_feed(feed_demand, feed_avail, options = list())

Arguments

feed_demand

A tibble of feed demand with columns year, territory, sub_territory, livestock_category, item_cbs_code, feed_group, feed_quality, demand_dm_t, and a logical fixed_demand.

feed_avail

A tibble of feed availability with columns year, sub_territory, item_cbs_code, feed_group, feed_quality, avail_dm_t, and feed_scale.

options

A named list of allocation options. See .redistribute_feed_options() for the available entries and their defaults. Notable entries:

grass_availability: a tibble with year, territory or area_code, and grass_avail_dm_t. Bounds the otherwise-unlimited pasture grass at that supply per polity-year. The grass deficit then cascades: pasture grass is capped at the ceiling, the deficit is redistributed to leftover non-grass availability in the polity (added as 7_grass_deficit_substitute intake, limited by that leftover), and the residual stays as biologically feasible underfeeding (scaling_factor < 1).
maintenance_share: a scalar fraction or a tibble with livestock_category and maintenance_share. When supplied, polities pushed below maintenance are also diagnosed; the over-stocked demand rows are attached to the result as the grass_deficit_diagnosis attribute.
distribute_surplus: set to FALSE to suppress the surplus-distribution pass that pushes leftover CBS availability onto variable-demand livestock. FALSE is correct for historical analyses, where the CBS feed element is the realised consumption; keep the default, TRUE, for unconstrained scenario projections.

Value

A tibble of realised intake per demand row. When maintenance_share is supplied alongside grass_availability, a grass_deficit_diagnosis attribute lists demand rows underfed below maintenance.
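
The grass-deficit cascade described under options can be worked through with scalars for a single polity-year. This is an arithmetic illustration only: the quantities are made up, all demand is assumed to be for pasture grass, and defining scaling_factor as realised intake over demand is an assumption.

```r
# Hypothetical polity-year quantities, in tonnes of dry matter.
grass_demand <- 100        # pasture grass demanded by livestock
grass_ceiling <- 70        # grass_avail_dm_t for the polity-year
leftover_non_grass <- 20   # non-grass availability left after allocation

# Step 1: pasture grass is capped at the ceiling.
grass_intake <- min(grass_demand, grass_ceiling)

# Step 2: the deficit is covered by leftover non-grass feed, up to that leftover.
deficit <- grass_demand - grass_intake
substitute_intake <- min(deficit, leftover_non_grass)

# Step 3: whatever remains is underfeeding.
residual_deficit <- deficit - substitute_intake
scaling_factor <- (grass_intake + substitute_intake) / grass_demand

c(
  grass_intake = grass_intake,
  substitute_intake = substitute_intake,
  residual_deficit = residual_deficit,
  scaling_factor = scaling_factor
)
```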

Regional MMS distribution.

Description

Fraction of manure managed in each MMS type by region and species.

Usage

regional_mms_distribution

Format

A tibble with region, species, mms_type, fraction.

Source

GLEAM 3.0 / FAO statistics.

Examples

regional_mms_distribution

Full polity and region reference table

Description

Extended reference table covering all polities and aggregate regions, including countries, territories, and statistical composites that appear in international databases but may lack standard ISO codes.

Usage

regions_full

Format

A tibble where each row corresponds to one polity or aggregate region. It contains the following columns (same definitions as polities_cats, minus the five trailing 0...36–0...40 artefact columns):

polity_code: Legacy current polity prefix. This is kept for compatibility with older code that expected ISO3-like values.
polity_name: Current polity, country, territory, or aggregate name.
V1: Internal row index.
code: Numeric FAOSTAT country/region code.
polity_area_code: Numeric WHEP reporting area code used in matrix workflows.
reporting_polity_code: Current periodized WHEP polity code for code.
reporting_polity_name: Current WHEP polity name for code.
reporting_polity_has_geometry: Logical flag indicating whether the current reporting polity has a polygon.
iso3c: ISO 3166-1 alpha-3 code (NA for aggregates).
FAOSTAT_name: Name used in FAOSTAT (may be "#N/A" for aggregates).
EU27: Logical EU27 membership flag.
name: Name used in external databases.
eia: EIA country identifier.
iea: IEA country identifier.
water_code: Water statistics numeric code.
water_area: Name used in water statistics.
baci: BACI trade database country code.
fish: Fisheries dataset numeric code.
region_code: Numeric regional code.
cbs: Logical CBS dataset membership flag.
fabio_code: FABIO database numeric code.
ADB_Region: Asian Development Bank region.
region: General world region.
uISO3c: UN M49 numeric code.
Lassaletta: Lassaletta et al. nitrogen study grouping.
region_krausmann: Krausmann regional grouping.
region_HANPP: HANPP study regional grouping.
region_krausmann2: Alternative Krausmann grouping.
region_UN_sub: UN M49 sub-region.
region_UN: UN M49 macro-region.
region_ILO1: ILO primary region.
region_ILO2: ILO secondary region.
region_ILO3: ILO tertiary region.
region_IEA: IEA region.
region_IPCC: IPCC region.
region_labour: Labour-focused region.
region_labour_agg: Aggregated labour region.
region_labour_mech: Labour mechanisation region.
region_test: Experimental regional grouping.

Source

Compiled from FAOSTAT, UN M49, ILO, IEA, and other international statistical sources.

Examples

head(regions_full)

Run the gridded land-use spatialization pipeline

Description

Wrapper around build_gridded_landuse() that resolves a named preset ("lpjml" or "whep") into a consistent bundle of input files, engine flags, and output paths. Use this to produce two comparable outputs from the same prepared parquet inputs: an LPJmL/LandInG-faithful run (for cell-by-cell comparison against LPJmL inputs) and the full WHEP run (all historical years, LUH2 type-aware allocation).

Presets can be combined with per-flag overrides to produce any intermediate configuration; the resolved configuration is written next to the outputs as run_metadata.yaml for traceability.

Usage

run_spatialize(
  preset = c("lpjml", "whep"),
  years = NULL,
  components = c("landuse", "livestock"),
  overrides = list(),
  paths = list()
)

Arguments

preset

One of "lpjml" or "whep". Selects a default bundle of engine flags and input choices. See Presets.

years

Integer vector of years to spatialize. If NULL, the preset default is used: for "lpjml", a benchmark sequence at 10-year intervals (seq(1850L, 2020L, by = 10L)), intersected with the years available in country_areas; for "whep", all years present in country_areas.
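
A small sketch of how the "lpjml" default resolves, assuming a hypothetical country_areas table that covers 1961 to 2021:

```r
# Benchmark sequence quoted in the argument description.
benchmark_years <- seq(1850L, 2020L, by = 10L)

# Hypothetical years available in country_areas.
available_years <- 1961:2021

# Only benchmark years that are actually available are kept.
years <- intersect(benchmark_years, available_years)
years
```

With this coverage, the benchmark years before 1970 are dropped because they have no matching input data.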

components

Character vector selecting which engines to run. Defaults to c("landuse", "livestock"). Pass a subset to run only one (e.g. "landuse"). Unknown entries raise an error.

overrides

Named list of flags that override the preset. Unknown keys raise an error. Recognised entries:

use_type_constraint (logical): enable/disable LUH2 type-aware allocation.
aggregate_to_cft (logical, default TRUE): write a CFT-aggregated parquet alongside the crop-level output.
max_iterations, expansion_threshold: forwarded to the landuse engine.
cft_target: one of "whep" (default for preset = "whep") or "lpjml" (default for preset = "lpjml"). Selects which column of cft_mapping.csv drives CFT aggregation: cft_name (granular 33-class WHEP taxonomy) or cft_lpjml (12 LPJmL crop CFTs plus a single others bucket).

paths

Named list of filesystem paths. Recognised entries:

l_files_dir: path to the L_files root, for local prepared inputs.
input_dir: directory holding the prepared input parquets. If NULL and l_files_dir is unset, the pinned WHEP spatialization inputs are used.
out_dir: output directory. If NULL, defaults to <l_files_dir>/whep/spatialize/<preset> when l_files_dir is supplied, otherwise to a session temporary directory (suffixed with _custom when overrides is non-empty). Created if missing.

Value

Invisibly, a named list with preset, resolved config, years, out_dir, and output_paths.

Presets

lpjml: LandInG-faithful configuration: no LUH2 type-aware allocation (use_type_constraint = FALSE) and a short default year sample suited to comparison against LPJmL inputs.
whep: Full WHEP configuration: LUH2 type-aware allocation (use_type_constraint = TRUE) and the full historical year range present in country_areas.
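
The way a preset, the default years and per-flag overrides combine can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: resolve_preset_sketch() and its available_years argument are hypothetical names, and only the defaults documented above (use_type_constraint, cft_target, aggregate_to_cft and the default year rule) are modelled.

```r
# Hypothetical helper for illustration; not part of whep.
resolve_preset_sketch <- function(preset = c("lpjml", "whep"),
                                  available_years,
                                  years = NULL,
                                  overrides = list()) {
  preset <- match.arg(preset)

  # Preset defaults as documented in the Presets and overrides entries.
  config <- switch(preset,
    lpjml = list(use_type_constraint = FALSE, cft_target = "lpjml"),
    whep = list(use_type_constraint = TRUE, cft_target = "whep")
  )
  config$aggregate_to_cft <- TRUE

  # Unknown override keys raise an error.
  known <- c(
    "use_type_constraint", "aggregate_to_cft", "max_iterations",
    "expansion_threshold", "cft_target"
  )
  unknown <- setdiff(names(overrides), known)
  if (length(unknown) > 0) {
    stop("Unknown override(s): ", paste(unknown, collapse = ", "))
  }
  config <- utils::modifyList(config, overrides)

  # Default years: decadal benchmark for "lpjml", all years for "whep".
  if (is.null(years)) {
    years <- if (preset == "lpjml") {
      intersect(seq(1850L, 2020L, by = 10L), available_years)
    } else {
      sort(unique(available_years))
    }
  }
  config$years <- years
  config
}

lpjml_cfg <- resolve_preset_sketch("lpjml", available_years = 1995:2012)
custom_cfg <- resolve_preset_sketch(
  "whep",
  available_years = 1995:1997,
  overrides = list(use_type_constraint = FALSE)
)
```

Here lpjml_cfg keeps only the benchmark years that fall inside the available range, while custom_cfg is the "whep" preset with type-aware allocation switched off, an intermediate configuration of the kind the Description mentions.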

Inputs read from `input_dir`

Landuse (components contains "landuse"):

country_areas.parquet
crop_patterns.parquet
gridded_cropland.parquet
country_grid.parquet
type_cropland.parquet (required when use_type_constraint = TRUE).

Livestock (components contains "livestock"):

livestock_country_data.parquet
gridded_pasture.parquet
gridded_cropland.parquet, country_grid.parquet
manure_pattern.parquet (optional, enables manure-intensity weighting if present).
livestock_mapping.csv from the installed package.

Outputs written to `out_dir`

gridded_landuse_crops.parquet — crop-level output.
gridded_landuse.parquet — CFT-aggregated output (when aggregate_to_cft = TRUE).
gridded_livestock_emissions.parquet — gridded livestock stocks and emissions (when livestock component selected).
run_metadata.yaml — resolved preset, components, flags, years, timestamp, and package version.

Examples

# Dispatch to the engine with a filtered year range (offline
# example; normally called against prepared parquet inputs).
country_areas <- tibble::tribble(
  ~year, ~area_code, ~item_prod_code, ~harvested_area_ha,
  1999L,         1L,             15L,                500,
  2000L,         1L,             15L,               1000
)
crop_patterns <- tibble::tribble(
  ~lon,  ~lat, ~item_prod_code, ~harvest_fraction,
   0.25, 50.25,             15L,               0.6,
   0.75, 50.25,             15L,               0.4
)
gridded_cropland <- tibble::tribble(
  ~lon,  ~lat,  ~year, ~cropland_ha,
   0.25, 50.25, 1999L,          800,
   0.75, 50.25, 1999L,          500,
   0.25, 50.25, 2000L,          800,
   0.75, 50.25, 2000L,          500
)
country_grid <- tibble::tribble(
  ~lon,  ~lat, ~area_code,
   0.25, 50.25,         1L,
   0.75, 50.25,         1L
)
build_gridded_landuse(
  country_areas, crop_patterns, gridded_cropland, country_grid,
  config = list(years = 2000L)
)

Smil (2001) global synthetic nitrogen production, 1913-2000

Description

Global synthetic-nitrogen production anchors from Smil (2001) "Enriching the Earth", Tables 5.2 and 5.3, cross-checked with Smil (2002) Ambio 31:126-131. Anchor years span 1913 (first commercial Haber-Bosch plant at BASF Oppau) to 2000. Used by prepare_nitrogen_inputs() to backcast country-level synthetic N for the pre-FAOSTAT period (years before 1961): the temporal shape is taken from this global series and downscaled to each country using its 1961-1965 share of global FAOSTAT synthetic N.

Pre-1913 values are treated as zero by the consumer and are not stored here.

Usage

smil_2001_synthetic_n_global

Format

A tibble with one row per anchor year:

year: Integer anchor year (1913, 1920, 1925, ..., 2000).
global_kt_n: Global synthetic-N production in kt N.

Source

Smil, V. (2001) Enriching the Earth: Fritz Haber, Carl Bosch, and the Transformation of World Food Production, MIT Press. Tables 5.2 and 5.3.

Examples

head(smil_2001_synthetic_n_global)
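
The downscaling described in the Description (global temporal shape multiplied by a fixed country share) can be illustrated with a small base R sketch. All numbers below are made up for illustration and are not values from Smil (2001) or FAOSTAT; the column country_kt_n and the variable country_share are hypothetical names.

```r
# Illustrative global anchors; NOT the values stored in
# smil_2001_synthetic_n_global.
global_series <- data.frame(
  year = c(1950L, 1955L, 1960L),
  global_kt_n = c(4000, 7000, 11000)
)

# Assumed 1961-1965 share of global synthetic N for one country (made up).
country_share <- 0.02

# The country series inherits the global shape, scaled by its share.
backcast <- transform(
  global_series,
  country_kt_n = global_kt_n * country_share
)
backcast
```

Because the share is constant, the ratio between any two years of the country series equals the ratio in the global series, which is what "temporal shape" means here.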

Split livestock excretion across manure-management systems.

Description

Splits the excreted nitrogen, carbon and volatile solids from estimate_n_excretion() across manure-management systems (MMS), separating the in-situ grazing stream (pasture/range/paddock, deposited where it falls) from the collected/housed streams routed to storage. The split conserves mass: the per-species MMS shares sum to one.

Usage

split_manure_management(excretion, options = list())

Arguments

excretion

A tibble from estimate_n_excretion() with year, territory, sub_territory, livestock_category, n_excretion, c_excretion and vs_excretion.

options

A named list of options. The mms_source entry selects the MMS-share table; "regional_default" is the global IPCC/GLEAM default stored in regional_mms_distribution.

Value

A tibble with one row per year x territory x sub_territory x livestock_category x mms_type, plus species_gen, loss_category, stream ("grazing" or "collected"), n_stream, c_stream, vs_stream and method_mms.

Examples

excretion <- tibble::tribble(
  ~year, ~territory, ~sub_territory, ~livestock_category,
  ~n_excretion, ~c_excretion, ~vs_excretion,
  2020L, "ES", NA, "Cattle_milk", 100, 1900, 60,
  2020L, "ES", NA, "Pigs", 30, 270, 20
)
split_manure_management(excretion)
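
The mass-conservation property stated in the Description can be checked by hand with a base R sketch. The shares and mms_type labels below are hypothetical and are not taken from regional_mms_distribution; the point is only that shares summing to one leave the excreted total unchanged after the split.

```r
# Hypothetical MMS shares for one species (illustrative values only).
shares <- data.frame(
  mms_type = c("pasture_range_paddock", "solid_storage", "liquid_slurry"),
  stream = c("grazing", "collected", "collected"),
  share = c(0.5, 0.3, 0.2),
  stringsAsFactors = FALSE
)

# Shares must sum to one for the split to conserve mass.
stopifnot(isTRUE(all.equal(sum(shares$share), 1)))

# Split 100 units of excreted N across the systems.
n_excretion <- 100
shares$n_stream <- n_excretion * shares$share

# Totals by stream: grazing stays in situ, collected goes to storage.
by_stream <- tapply(shares$n_stream, shares$stream, sum)
by_stream
```

The same check applies to the carbon and volatile-solids columns, since the split uses one set of shares per species.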

Summarise a footprint conservation report.

Description

Roll up the per-origin report from check_footprint_conservation() into a single global verdict.

Usage

summarise_conservation(report)

Arguments

report

Tibble returned by check_footprint_conservation().

Value

A one-row tibble with n_origin, n_ok, n_flagged, n_dropped, n_under_traced, n_over_traced, total_direct, total_embodied and global_rel_discrepancy.

Examples

z_mat <- matrix(c(0, 5, 10, 0), nrow = 2)
x_vec <- c(100, 200)
y_mat <- matrix(c(85, 195), ncol = 1)
extensions <- c(50, 30)
labels <- tibble::tibble(
  area_code = c(1L, 1L),
  item_cbs_code = c(1L, 2L)
)
compute_footprint(
  x_vec = x_vec, y_mat = y_mat, extensions = extensions,
  labels = labels, z_mat = z_mat
) |>
  check_footprint_conservation(extensions, labels, x_vec) |>
  summarise_conservation()
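
To make the roll-up concrete, the following base R sketch builds a one-row summary from a per-origin table. It is an assumption-laden illustration: the input columns (origin, direct, embodied, status) and the definition of the relative discrepancy as (embodied - direct) / direct are hypothetical, and the real report returned by check_footprint_conservation() may be laid out differently.

```r
# Hypothetical per-origin report; column names are assumptions.
report <- data.frame(
  origin = c("A", "B", "C"),
  direct = c(50, 30, 20),
  embodied = c(50, 29.4, 20),
  status = c("ok", "under_traced", "ok"),
  stringsAsFactors = FALSE
)

# Collapse to a single global verdict row.
summary_row <- data.frame(
  n_origin = nrow(report),
  n_ok = sum(report$status == "ok"),
  n_flagged = sum(report$status != "ok"),
  total_direct = sum(report$direct),
  total_embodied = sum(report$embodied)
)

# Assumed definition: signed relative gap between embodied and direct.
summary_row$global_rel_discrepancy <-
  (summary_row$total_embodied - summary_row$total_direct) /
    summary_row$total_direct
summary_row
```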

Summarise biological nitrogen fixation results.

Description

Aggregates the per-row BNF components into group totals, shares and mean modifiers.

Usage

summarize_bnf(x, group_by = "item_prod_code")

Arguments

x

A tibble with crop_bnf_t, weed_bnf_t, nonsymbiotic_bnf_t and bnf_t (the output of calculate_bnf()).

group_by

Character vector of grouping columns (default "item_prod_code"); use NULL for an overall summary.

Value

A tibble with per-group counts, BNF totals, component percentages and mean environmental factors.

Examples

tibble::tibble(
  item_prod_code = "176", crop_npp_n_t = 10, product_n_t = 5,
  weed_npp_n_t = 4, land_use = "Cropland", legumes_seeded = 0,
  seeded_cover_crop_share = 0, area_ha = 40
) |>
  calculate_bnf() |>
  summarize_bnf()
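
The kind of aggregation described above (group totals and component shares) can be sketched in base R without the package. The input values are made up, crop_pct is a hypothetical output name, and the sketch assumes that bnf_t is the sum of the three components; the actual output columns of summarize_bnf() may differ.

```r
# Made-up per-row BNF components using the documented column names.
bnf <- data.frame(
  item_prod_code = c("176", "176", "187"),
  crop_bnf_t = c(6, 4, 3),
  weed_bnf_t = c(1, 1, 0.5),
  nonsymbiotic_bnf_t = c(1, 1, 0.5),
  stringsAsFactors = FALSE
)
# Assumption: total BNF is the sum of the three components.
bnf$bnf_t <- bnf$crop_bnf_t + bnf$weed_bnf_t + bnf$nonsymbiotic_bnf_t

# Group totals per item.
totals <- aggregate(
  cbind(crop_bnf_t, weed_bnf_t, nonsymbiotic_bnf_t, bnf_t) ~ item_prod_code,
  data = bnf,
  FUN = sum
)

# Share of crop BNF in the group total, in percent.
totals$crop_pct <- 100 * totals$crop_bnf_t / totals$bnf_t
totals
```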

Temperature adjustment factors for NEm.

Description

Adjustment multipliers for net energy maintenance under cold stress, thermoneutral, and heat stress conditions.

Usage

temperature_adjustment

Format

A tibble with temp_range, temp_min, temp_max, adjustment_factor.

Source

NRC 2001; IPCC 2019.

Examples

temperature_adjustment
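
A table with this layout is typically used as a range lookup. The base R sketch below shows one way to do that; the thresholds, factors and the half-open interval convention (temp_min inclusive, temp_max exclusive) are assumptions for illustration and are not the values shipped in temperature_adjustment. The helper lookup_factor() is hypothetical.

```r
# Made-up table with the documented column names.
adj <- data.frame(
  temp_range = c("cold", "thermoneutral", "heat"),
  temp_min = c(-Inf, 5, 25),
  temp_max = c(5, 25, Inf),
  adjustment_factor = c(1.2, 1.0, 1.1),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the factor of the range containing temp_c,
# treating each range as [temp_min, temp_max).
lookup_factor <- function(temp_c, table) {
  idx <- which(temp_c >= table$temp_min & temp_c < table$temp_max)
  table$adjustment_factor[idx[1]]
}

cold_factor <- lookup_factor(0, adj)
neutral_factor <- lookup_factor(5, adj)
heat_factor <- lookup_factor(30, adj)
```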

Uncertainty ranges for emission parameters.

Description

Lower and upper multipliers for key emission parameters (Ym, MCF, Bo, EF_N2O, Nex).

Usage

uncertainty_ranges

Format

A tibble with parameter, lower_mult, upper_mult, distribution.

Source

IPCC 2019 Refinement, Vol 4, Ch 10.

Examples

uncertainty_ranges
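
Multipliers of this kind are applied to a central estimate to obtain lower and upper bounds. The following base R sketch shows the arithmetic; the multipliers, the central values and the distribution labels are made up and are not the contents of uncertainty_ranges.

```r
# Made-up table with the documented column names.
ranges <- data.frame(
  parameter = c("Ym", "MCF"),
  lower_mult = c(0.8, 0.7),
  upper_mult = c(1.2, 1.3),
  distribution = c("normal", "normal"),
  stringsAsFactors = FALSE
)

# Hypothetical central estimates for the same parameters.
central <- c(Ym = 6.5, MCF = 0.2)

# Bounds are the central value scaled by each multiplier.
ranges$central <- unname(central[ranges$parameter])
ranges$lower <- ranges$central * ranges$lower_mult
ranges$upper <- ranges$central * ranges$upper_mult
ranges
```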

Clear the build pipeline cache

Description

Removes cached results from build_primary_production(), build_commodity_balances(), and build_processing_coefs() so that the next call rebuilds from scratch.

Usage

whep_clear_cache()

Value

Invisible NULL.

Examples

whep_clear_cache()
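
The effect of clearing a cache ("the next call rebuilds from scratch") can be illustrated with a generic environment-based cache in base R. This is not how whep stores its cached results, which this manual does not describe; cache_env, cached_build(), clear_cache_sketch() and slow_build() are all hypothetical names.

```r
# A private environment acting as the cache (hypothetical).
cache_env <- new.env(parent = emptyenv())

# Return the cached value for `key`, building it only on the first call.
cached_build <- function(key, build_fn) {
  if (!exists(key, envir = cache_env, inherits = FALSE)) {
    assign(key, build_fn(), envir = cache_env)
  }
  get(key, envir = cache_env, inherits = FALSE)
}

# Drop every cached entry and return NULL invisibly.
clear_cache_sketch <- function() {
  rm(list = ls(envir = cache_env, all.names = TRUE), envir = cache_env)
  invisible(NULL)
}

# Count how many times the expensive step actually runs.
n_builds <- 0
slow_build <- function() {
  n_builds <<- n_builds + 1
  data.frame(x = 1:3)
}

first <- cached_build("primary_production", slow_build)
second <- cached_build("primary_production", slow_build)
builds_before_clear <- n_builds

clear_cache_sketch()
third <- cached_build("primary_production", slow_build)
builds_after_clear <- n_builds
```

The second call reuses the stored result, so the build counter only increases again after the cache has been cleared.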

Read a WHEP coefficient table.

Description

Reads one of the coefficient tables shipped as CSV under inst/extdata/coefs. These tables are small and versioned inside the package (read at runtime, not stored remotely), so no download is needed.

Usage

whep_coef_table(name)

Arguments

name

Coefficient table name (the file stem), for example "bio_coefs" or "ipcc_residue_coefs".

Value

A tibble with the coefficient table.

Examples

whep_coef_table("residue_feed_fraction")
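
Looking up a table by its file stem amounts to building a path and reading the CSV. The base R sketch below demonstrates the idea against a temporary directory so that it runs without the package; read_coef_table_sketch(), the demo directory and the demo_coefs table are hypothetical, and the real function resolves the installed extdata/coefs directory itself.

```r
# Hypothetical reader: map a file stem to <coef_dir>/<name>.csv.
read_coef_table_sketch <- function(name, coef_dir) {
  path <- file.path(coef_dir, paste0(name, ".csv"))
  if (!file.exists(path)) {
    stop("No coefficient table named '", name, "' in ", coef_dir)
  }
  utils::read.csv(path, stringsAsFactors = FALSE)
}

# Stand-in for the packaged coefficient directory.
demo_dir <- file.path(tempdir(), "coefs_demo")
dir.create(demo_dir, showWarnings = FALSE)
utils::write.csv(
  data.frame(item = c("wheat", "maize"), fraction = c(0.3, 0.4)),
  file.path(demo_dir, "demo_coefs.csv"),
  row.names = FALSE
)

demo <- read_coef_table_sketch("demo_coefs", demo_dir)
demo
```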

External inputs

Description

The information needed for accessing external datasets used as inputs in our modeling.

Usage

whep_inputs

Format

A tibble where each row corresponds to one external input dataset. It contains the following columns:

alias: An internal name used to refer to this dataset, which is the expected name when trying to get the dataset with whep_read_file().
board_url: The public static URL where the data is found, following the concept of a board from the pins package, which is what we use for storing these input datasets.
version: The specific version of the dataset, as defined by the pins package. The version is a string similar to "20250714T123343Z-114b5". This version is the one used by default if no version is specified when calling whep_read_file(). If you want to use a different one, you can find the available versions of a file by using whep_list_file_versions().

Source

Created by the package authors.

Input file versions

Description

Lists all existing versions of an input file from whep_inputs.

Usage

whep_list_file_versions(file_alias)

Arguments

file_alias

Internal name of the requested file. You can find the possible values in the whep_inputs dataset.

Value

A tibble where each row is a version. For details about its format, see pins::pin_versions().

Examples

whep_list_file_versions("read_example")

Download, cache and read files

Description

Fetches input files that the package's functions need but that were built from external sources and are too large to ship with the package. The function is public for transparency, so that users can inspect the original inputs that were not processed directly in this package.

If the requested file doesn't exist locally, it is downloaded from a public link and cached before reading it. This is all implemented using the pins package. It supports multiple file formats and file versioning.

Usage

whep_read_file(file_alias, type = "parquet", version = NULL)

Arguments

file_alias

Internal name of the requested file. You can find the possible values in the alias column of the whep_inputs dataset.

type

The extension of the file that must be read. Possible values:

parquet: This is the default value for code efficiency reasons.
csv: Mainly available for those who want a more human-readable option. If a parquet version is available, requesting csv brings no benefit: the function returns the dataset as an R object either way, so the origin is irrelevant, and parquet is read faster.

Each file is saved in both formats for transparency and accessibility, e.g. so that the data can be shared with non-programmers who can easily import a CSV into a spreadsheet. You will most likely never need to set this option manually, unless a file could not be supplied in one format (e.g. parquet) and is only available in another.

version

The version of the file that must be read. Possible values:

NULL: This is the default value. When the file has a registry version, that frozen version is used so that the code is reproducible; each release has its own frozen versions. The frozen version is the string found in the version column of whep_inputs. If the registry version is blank, the latest board version is requested.
"latest": This overrides the frozen version and instead fetches the latest one that is available. This might or might not match the frozen version.
Other: A specific version can also be used. For more details read the version column information from whep_inputs.
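
The three cases above can be summarised as a small decision rule. The base R sketch below is an illustration only, not the package's code: resolve_version_sketch(), the registry data frame and the "unpinned_example" alias are hypothetical, and the version string is reused from the Examples purely as a placeholder for a frozen version.

```r
# Hypothetical stand-in for the alias and version columns of whep_inputs.
registry <- data.frame(
  alias = c("read_example", "unpinned_example"),
  version = c("20250721T152646Z-ce61b", ""),
  stringsAsFactors = FALSE
)

# Decide which version string to request from the board.
resolve_version_sketch <- function(file_alias, version = NULL, registry) {
  # "latest" or an explicit version is passed through unchanged.
  if (!is.null(version)) {
    return(version)
  }
  frozen <- registry$version[registry$alias == file_alias]
  if (length(frozen) == 0) {
    stop("Unknown file alias: ", file_alias)
  }
  # A blank registry version falls back to the latest board version.
  if (is.na(frozen[1]) || !nzchar(frozen[1])) "latest" else frozen[1]
}

default_version <- resolve_version_sketch("read_example", registry = registry)
blank_version <- resolve_version_sketch("unpinned_example", registry = registry)
forced_latest <- resolve_version_sketch("read_example", "latest", registry)
```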

Value

A tibble with the dataset. Some information about each dataset can be found in the code where it's used as input for further processing.

Examples

whep_read_file("read_example")
whep_read_file("read_example", type = "parquet", version = "latest")
whep_read_file(
  "read_example",
  type = "csv",
  version = "20250721T152646Z-ce61b"
)

Package 'whep'

Help Index

Get area codes from area names

Description

Usage

Arguments

Value

Examples

Get area names from area codes

Description

Usage

Arguments

Value

Examples

Add a final-demand product-area stage to footprints.

Description

Usage

Arguments

Value

Get commodity balance sheet item codes from item names

Description

Usage

Arguments

Value

Examples

Get commodity balance sheet item names from item codes

Description

Usage

Arguments

Value

Examples

Get production item codes from item names

Description

Usage

Arguments

Value

Examples

Get production item names from item codes

Description

Usage

Arguments

Value

Examples

Add WHEP polity codes to a table

Description

Usage

Arguments

Value

Aggregate gridded grass availability to polity totals.

Description

Usage

Arguments

Value

Examples

Align an extension table to input-output sector labels.

Description

Usage

Arguments

Value

Examples

Allocate grazing land forward to livestock products.

Description

Usage

Arguments

Value

Examples

Allocate field-available manure to cropland and grassland by crop.

Description

Usage

Arguments

Value

Examples

Spill surplus manure to neighbouring cells with spare capacity.

Description

Usage

Arguments

Value

Examples

Animal codes and classifications

Description