---
title: "Getting Started with tsg"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with tsg}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(tsg)
```

`tsg` helps you turn a dataset into a publication-ready summary table (a count of each category, a two-way comparison, or anything in between) and save it to Excel in one short pipeline. No manual formatting required.

This vignette walks through the most common tasks:

1. Counting values in a column
2. Comparing two columns against each other
3. Breaking results down by a group
4. Saving your table to a file

---

## The sample dataset

All examples use `person_record`, a sample household survey dataset included with the package. It contains one row per respondent and records their sex, age, marital status, employment status, and ratings on six functional difficulty questions (seeing, hearing, walking, etc.).

```{r}
head(person_record)
```

---

## Count values in a column

`generate_frequency()` counts how many respondents fall into each category of a column, and adds a percentage column automatically.

```{r}
person_record |>
  generate_frequency(sex)
```

The result is a table with one row per category, plus a **Total** row at the bottom.

### Count several columns at once

Pass more than one column name and you get back a list of tables, one per column.

```{r}
person_record |>
  generate_frequency(sex, age, marital_status)
```

### Show only the top categories

Use `top_n` to keep only the most frequent categories. By default, everything else is rolled up into an "Others" row.

```{r}
person_record |>
  generate_frequency(
    marital_status,
    top_n = 3
  )
```

Set `top_n_only = TRUE` to drop the "Others" row entirely and show only the top results.

```{r}
person_record |>
  generate_frequency(
    marital_status,
    top_n = 3,
    top_n_only = TRUE
  )
```

### Control how rows are sorted

By default, rows are sorted by frequency (most common first). Set `sort_value = FALSE` to sort by the category values instead, which is useful for ordered variables like `age`.

```{r}
person_record |>
  generate_frequency(age, sort_value = FALSE)
```

When counting multiple columns at once, you can exclude specific columns from sorting with `sort_except`:

```{r}
person_record |>
  generate_frequency(
    sex,
    age,
    marital_status,
    sort_except = "age" # keep age in its natural order
  )
```

### Include or exclude missing values

By default, missing values (`NA`) are counted and shown as a separate row. Set `include_na = FALSE` to leave them out.

```{r}
person_record |>
  generate_frequency(employed, include_na = TRUE) # default

person_record |>
  generate_frequency(employed, include_na = FALSE)
```

### Combine similar columns into one table

If several columns share the same set of categories (like the six functional difficulty columns in `person_record`), you can stack them into a single table with `collapse_list = TRUE`.

```{r}
person_record |>
  generate_frequency(
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating,
    collapse_list = TRUE
  )
```

### Other options

- **Running totals**: add cumulative counts and percentages with `add_cumulative = TRUE` and `add_cumulative_percent = TRUE`.
- **Proportions instead of percentages**: use `as_proportion = TRUE` to get values between 0 and 1.
- **Total row at the top**: set `position_total = "top"` to move the Total row above the data rows.
- **Custom total label**: use `label_total` to rename the "Total" row.

```{r}
person_record |>
  generate_frequency(
    sex,
    add_cumulative = TRUE,
    add_cumulative_percent = TRUE
  )
```

---

## Compare two columns in a grid (cross-tabulation)

`generate_crosstab()` builds a two-way table: the first column you name goes in the rows, and the second becomes column groups.
You get a count and a percentage for each combination.

```{r}
person_record |>
  generate_crosstab(marital_status, sex)
```

> **Tip:** If you name only one column, `generate_crosstab()` automatically falls back to a frequency table for that column.

### Percentages by column instead of by row

By default, percentages are calculated across each row (what share of each marital-status group is male vs. female?). Set `percent_by_column = TRUE` to flip this: percentages are then calculated down each column (what share of males is in each marital-status group?).

```{r}
person_record |>
  generate_crosstab(
    marital_status,
    sex,
    percent_by_column = TRUE
  )
```

### Cross-tabulate against several columns at once

Pass more than one column after the row variable and you get a list of cross-tabs, one per column.

```{r}
person_record |>
  generate_crosstab(
    sex,
    seeing,
    hearing,
    walking,
    remembering,
    self_caring,
    communicating
  )
```

### Other options

The same options available for `generate_frequency()` also work in `generate_crosstab()`: `as_proportion`, `position_total`, `include_na`, `label_total`, and more.

```{r}
person_record |>
  generate_crosstab(
    marital_status,
    sex,
    position_total = "top"
  )
```

---

## Break down results by a group

Pipe a `group_by()` call before either function to calculate counts separately for each group. The result is a single merged table with the group labels embedded in the category column.

```{r}
person_record |>
  dplyr::group_by(sex) |>
  generate_frequency(marital_status)
```

The same works for cross-tabulations:

```{r}
person_record |>
  dplyr::group_by(sex) |>
  generate_crosstab(marital_status, employed)
```

> **Want separate tables per group?** See the [Grouped Tables and Side-by-Side Comparisons](advanced.html) vignette for `group_as_list` and `group_as_hierarchy`.

---

## Save your table to a file

Once you have a table, pipe it to `write_xlsx()` to save it as an Excel file.

```{r, eval=FALSE}
person_record |>
  generate_frequency(sex) |>
  write_xlsx(path = "table-sex.xlsx")
```

### Add a title and notes

Chain `add_table_title()`, `add_table_subtitle()`, `add_source_note()`, and `add_footnote()` before saving to attach metadata that appears as styled rows above and below the table.

```{r, eval=FALSE}
person_record |>
  generate_crosstab(marital_status, sex) |>
  add_table_title("Marital Status by Sex") |>
  add_table_subtitle("National Sample Survey, 2024") |>
  add_source_note("Source: person_record dataset.") |>
  add_footnote("Missing values are excluded from the denominator.") |>
  write_xlsx(path = "table-marital-sex.xlsx")
```

### Change the look of your table

Pass a built-in style with `get_tsg_facade()` to quickly change the visual appearance of your exported table. The package ships with two built-in styles: `"default"` (clean and neutral) and `"yolo"` (bolder colours).

```{r, eval=FALSE}
person_record |>
  generate_frequency(sex) |>
  write_xlsx(
    path = "table-sex-styled.xlsx",
    facade = get_tsg_facade("yolo")
  )
```

For fine-grained control (changing specific fonts, colours, or cell sizes), see the [Customizing How Your Tables Look](facade.html) vignette.

### Save to other formats

`tsg` can also save to HTML, PDF, and Word. The API is identical; just change the function name.

```{r, eval=FALSE}
tbl <- person_record |>
  generate_crosstab(marital_status, sex) |>
  add_table_title("Marital Status by Sex") |>
  add_source_note("Source: person_record dataset")

write_xlsx(tbl, path = "table.xlsx") # Excel
write_html(tbl, path = "table.html") # HTML (requires the gt package)
write_pdf(tbl, path = "table.pdf")   # PDF (requires gt + webshot2)
write_docx(tbl, path = "table.docx") # Word (requires officer + flextable)
```

For a full guide to each format, including multi-sheet workbooks and managing metadata for large reports, see the [Saving and Sharing Your Tables](output-formats.html) vignette.
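
Because the non-Excel writers depend on optional packages (as noted in the comments in the chunk above), it can help to check what is installed before exporting. The following is a minimal sketch using base R's `requireNamespace()`; the helper `missing_packages()` and the `format_deps` list are hypothetical names introduced here for illustration and are not part of `tsg`.

```{r}
# Hypothetical helper (not part of tsg): return the packages in `pkgs`
# that are not installed on this machine.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Optional dependencies per output format, taken from the comments above.
format_deps <- list(
  html = "gt",
  pdf  = c("gt", "webshot2"),
  docx = c("officer", "flextable")
)

# For each format, list what still needs installing (empty means ready).
lapply(format_deps, missing_packages)
```

An empty character vector for a format means its dependencies are available; anything listed can be installed with `install.packages()` before calling the corresponding writer.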