[Experimental]

cat_add_data() adds data to an elic_cat object from different sources.

Usage

cat_add_data(
  x,
  data_source,
  topic,
  ...,
  sep = ",",
  sheet = 1,
  overwrite = FALSE,
  verbose = TRUE
)

Arguments

x

an object of class elic_cat.

data_source

either a data.frame or tibble, a string with the path to a csv or xlsx file, or anything accepted by the googlesheets4::read_sheet() function.

topic

character string that indicates the topic to which the data belong.

...

Unused arguments, included only for future extensions of the function.

sep

character used as field separator, used only when data_source is a path to a csv file.

sheet

integer or character to select the sheet. The sheet can be referenced by its position with a number or by its name with a string. Used only when data_source is a path to an xlsx file or when data are imported from Google Sheets.

overwrite

logical, whether to overwrite existing data already added to the elic_cat object.

verbose

logical, if TRUE it prints informative messages.
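
The effect of sep can be previewed before calling cat_add_data(). The following is a minimal sketch in base R, assuming a hypothetical semicolon-separated file; read.csv() is used here only to inspect the file, and nothing is implied about how cat_add_data() reads csv files internally.

```r
# Hypothetical semicolon-separated file written to a temporary location
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name;category;option;confidence;estimate",
             "expert 1;category 1;option 1;15;0.08",
             "expert 1;category 2;option 1;15;0.92"),
           csv_path)

# With the default separator the whole line ends up in a single column
n_cols_comma <- ncol(read.csv(csv_path, sep = ","))

# With the matching separator the five expected columns are recovered
preview <- read.csv(csv_path, sep = ";")
n_cols_semicolon <- ncol(preview)

# The same separator would then be passed on, for example:
# cat_add_data(x, data_source = csv_path, topic = "topic_1", sep = ";")
```

If the preview does not show five columns, the separator most likely does not match the file.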

Value

The provided object of class elic_cat updated with the data.

Data format

For each topic, data are expected to have five columns, built as follows:

  • The first column of the data should hold the names of the experts. The name of each expert should be repeated as many times as the number of categories multiplied by the number of options (i.e. each expert should appear \(number\ of\ categories \cdot number\ of\ options\) times).

  • The second column should be the names of the categories considered in the elicitation. Each block of categories should be repeated as many times as the number of options considered.

  • The third column should hold the names of the options considered in the study. The name of each option should be repeated as many times as the number of categories considered.

  • The fourth column should be the experts' confidence in their own estimates (given in percent). Experts should state how confident they are in their estimates for each block of categories and for each option. Therefore, each confidence value should be repeated as many times as the number of categories considered.

  • The final column should be the estimates of each expert for each option and category.

The names of the columns are not important, because cat_add_data() overwrites them according to the following convention:

The first column will be renamed id, the second column category, the third column option, the fourth column confidence, and the fifth column estimate.

Here is an example of data correctly formatted for an elicitation with five categories and two options (only one expert is shown):

name         category       option      confidence      estimate
----------------------------------------------------------------
expert 1     category 1     option 1            15          0.08
expert 1     category 2     option 1            15          0
expert 1     category 3     option 1            15          0.84
expert 1     category 4     option 1            15          0.02
expert 1     category 5     option 1            15          0.06
expert 1     category 1     option 2            35          0.02
expert 1     category 2     option 2            35          0.11
expert 1     category 3     option 2            35          0.19
expert 1     category 4     option 2            35          0.02
expert 1     category 5     option 2            35          0.66
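
The layout in the table above can be built programmatically. The following is a minimal sketch in base R that reproduces the ten rows shown for one expert; the object names (my_categories, my_options, layout, my_data) are chosen only for illustration.

```r
my_categories <- paste("category", 1:5)
my_options <- paste("option", 1:2)

# expand.grid() varies its first argument fastest, so the block of
# categories is repeated once for each option
layout <- expand.grid(category = my_categories,
                      option = my_options,
                      name = "expert 1",
                      stringsAsFactors = FALSE)

my_data <- data.frame(
  name = layout$name,
  category = layout$category,
  option = layout$option,
  # One confidence value per option, repeated for every category
  confidence = rep(c(15, 35), each = length(my_categories)),
  # Estimates copied from the table above
  estimate = c(0.08, 0, 0.84, 0.02, 0.06,
               0.02, 0.11, 0.19, 0.02, 0.66)
)

# Each expert should appear (number of categories * number of options) times
stopifnot(nrow(my_data) == length(my_categories) * length(my_options))
```

For several experts, the same idea extends by passing a vector of names to expand.grid() and supplying confidence and estimate values in the matching order.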

Data cleaning

When data are added to the elic_cat object, names are first standardised by converting capital letters to lower case and by removing any whitespace and punctuation. Then, data are anonymised by converting names to short sha1 hashes. In this way, sensitive information collected during the elicitation process never reaches the elic_cat object.
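
The two cleaning steps can be illustrated with a minimal sketch. This is not the package's internal code: the helper names standardise_names() and hash_name() are hypothetical, the use of the digest package is an assumption, and the hash length of 7 characters is an arbitrary choice, since the length of the "short" hash is not specified here.

```r
# Hypothetical helper: lower-case the names, then drop whitespace and
# punctuation
standardise_names <- function(x) {
  x <- tolower(x)
  gsub("[[:space:][:punct:]]+", "", x)
}

raw_names <- c("Jane Doe", "jane-doe ", "JANE.DOE")
clean_names <- standardise_names(raw_names)
clean_names
# All three spellings collapse to "janedoe"

# Hashing step, run only if the digest package is installed (assumption:
# the package may use a different hashing tool and hash length)
if (requireNamespace("digest", quietly = TRUE)) {
  hash_name <- function(name) {
    full_hash <- digest::digest(name, algo = "sha1", serialize = FALSE)
    substr(full_hash, 1, 7)
  }
  ids <- vapply(clean_names, hash_name, character(1), USE.NAMES = FALSE)
  ids
}
```

Because standardisation happens before hashing, different spellings of the same name map to the same identifier.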

See also

Other cat data helpers: cat_get_data(), cat_sample_data(), cat_start(), summary.cat_sample()

Author

Sergio Vignali and Maude Vernet

Examples

# Create the elic_cat object for an elicitation process with three topics,
# four options, five categories and a maximum of six experts per topic
my_categories <- c("category_1", "category_2", "category_3",
                   "category_4", "category_5")
my_options <- c("option_1", "option_2", "option_3", "option_4")
my_topics <- c("topic_1", "topic_2", "topic_3")
x <- cat_start(categories = my_categories,
               options = my_options,
               experts = 6,
               topics = my_topics)
#>  <elic_cat> object for "Elicitation" correctly initialised

# Add data for the three topics from a data.frame. Notice that the three
# commands can be piped
my_elicit <- cat_add_data(x,
                          data_source = topic_1,
                          topic = "topic_1") |>
  cat_add_data(data_source = topic_2, topic = "topic_2") |>
  cat_add_data(data_source = topic_3, topic = "topic_3")
#>  Data added to Topic "topic_1" from "data.frame"
#>  Data added to Topic "topic_2" from "data.frame"
#>  Data added to Topic "topic_3" from "data.frame"
my_elicit
#> 
#> ── Elicitation ──
#> 
#> • Categories: "category_1", "category_2", "category_3", "category_4", and
#> "category_5"
#> • Options: "option_1", "option_2", "option_3", and "option_4"
#> • Number of experts: 6
#> • Topics: "topic_1", "topic_2", and "topic_3"
#> • Data available for topics "topic_1", "topic_2", and "topic_3"

# Add data for the three topics from csv files
files <- list.files(path = system.file("extdata", package = "elicitr"),
                    pattern = "topic_",
                    full.names = TRUE)
my_elicit <- cat_add_data(x,
                          data_source = files[1],
                          topic = "topic_1") |>
  cat_add_data(data_source = files[2], topic = "topic_2") |>
  cat_add_data(data_source = files[3], topic = "topic_3")
#>  Data added to Topic "topic_1" from "csv file"
#>  Data added to Topic "topic_2" from "csv file"
#>  Data added to Topic "topic_3" from "csv file"
my_elicit
#> 
#> ── Elicitation ──
#> 
#> • Categories: "category_1", "category_2", "category_3", "category_4", and
#> "category_5"
#> • Options: "option_1", "option_2", "option_3", and "option_4"
#> • Number of experts: 6
#> • Topics: "topic_1", "topic_2", and "topic_3"
#> • Data available for topics "topic_1", "topic_2", and "topic_3"

# Add data for the three topics from an xlsx file with three sheets
file <- list.files(path = system.file("extdata", package = "elicitr"),
                   pattern = "topics",
                   full.names = TRUE)
# Using the sheet index
my_elicit <- cat_add_data(x,
                          data_source = file,
                          sheet = 1,
                          topic = "topic_1") |>
  cat_add_data(data_source = file,
               sheet = 2,
               topic = "topic_2") |>
  cat_add_data(data_source = file,
               sheet = 3,
               topic = "topic_3")
#>  Data added to Topic "topic_1" from "xlsx file"
#>  Data added to Topic "topic_2" from "xlsx file"
#>  Data added to Topic "topic_3" from "xlsx file"
my_elicit
#> 
#> ── Elicitation ──
#> 
#> • Categories: "category_1", "category_2", "category_3", "category_4", and
#> "category_5"
#> • Options: "option_1", "option_2", "option_3", and "option_4"
#> • Number of experts: 6
#> • Topics: "topic_1", "topic_2", and "topic_3"
#> • Data available for topics "topic_1", "topic_2", and "topic_3"
# Using the sheet name
my_elicit <- cat_add_data(x,
                          data_source = file,
                          sheet = "Topic 1",
                          topic = "topic_1") |>
  cat_add_data(data_source = file,
               sheet = "Topic 2",
               topic = "topic_2") |>
  cat_add_data(data_source = file,
               sheet = "Topic 3",
               topic = "topic_3")
#>  Data added to Topic "topic_1" from "xlsx file"
#>  Data added to Topic "topic_2" from "xlsx file"
#>  Data added to Topic "topic_3" from "xlsx file"
my_elicit
#> 
#> ── Elicitation ──
#> 
#> • Categories: "category_1", "category_2", "category_3", "category_4", and
#> "category_5"
#> • Options: "option_1", "option_2", "option_3", and "option_4"
#> • Number of experts: 6
#> • Topics: "topic_1", "topic_2", and "topic_3"
#> • Data available for topics "topic_1", "topic_2", and "topic_3"

if (FALSE) { # interactive()
# Add data for the three topics from Google Sheets
googlesheets4::gs4_deauth()
gs <- "18VHeHB89P1s-6banaVoqOP-ggFmQZYx-z_31nMffAb8"
# Using the sheet index
my_elicit <- cat_add_data(x,
                          data_source = gs,
                          sheet = 1,
                          topic = "topic_1") |>
  cat_add_data(data_source = gs,
               sheet = 2,
               topic = "topic_2") |>
  cat_add_data(data_source = gs,
               sheet = 3,
               topic = "topic_3")
my_elicit
# Using the sheet name
my_elicit <- cat_add_data(x, data_source = gs,
                          sheet = "Topic 1",
                          topic = "topic_1") |>
  cat_add_data(data_source = gs,
               sheet = "Topic 2",
               topic = "topic_2") |>
  cat_add_data(data_source = gs,
               sheet = "Topic 3",
               topic = "topic_3")
my_elicit
}