Convert the SubRip file format to a tabular data frame of times and text.
read_srt(path, collapse = "\n")
A path or connection to an .srt file.
The character string used to join the lines of text within a single subtitle. The default is a newline ("\n").
A data frame (tibble) with one row per subtitle: the index (n), the start and end times, and the subtitle text.
The SubRip format is a plain-text, non-tabular file made up of subtitle groups separated by a blank line. Each group consists of a numeric index, a timestamp line giving the start and end times of the subtitle, and one or more lines of subtitle text. These components (index, time, text) can be parsed individually and combined into a data frame with one row per subtitle group.
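The parsing idea described above can be sketched in base R. This is only an illustration and not the package's implementation: `srt_seconds()` and `parse_srt_sketch()` are hypothetical helpers, the sample lines (including their timestamps and line breaks) are made up for the example, and the sketch assumes well-formed input with `HH:MM:SS,mmm --> HH:MM:SS,mmm` timestamps.

```r
# Hypothetical helper (not part of the package):
# convert "HH:MM:SS,mmm" timestamps to numeric seconds.
srt_seconds <- function(x) {
  parts <- strsplit(sub(",", ".", x, fixed = TRUE), ":", fixed = TRUE)
  vapply(parts, function(p) sum(as.numeric(p) * c(3600, 60, 1)), numeric(1))
}

# Hypothetical parser sketch: returns one row per subtitle group.
parse_srt_sketch <- function(lines, collapse = "\n") {
  lines <- trimws(lines)
  # Blank lines separate groups, so a running count of blank lines
  # gives every non-blank line a group id.
  group <- cumsum(lines == "")
  keep <- lines != ""
  blocks <- split(lines[keep], group[keep])
  rows <- lapply(blocks, function(b) {
    # b[1] is the index, b[2] the timestamp line, the rest is text.
    times <- strsplit(b[2], " --> ", fixed = TRUE)[[1]]
    data.frame(
      n = as.integer(b[1]),
      start = srt_seconds(times[1]),
      end = srt_seconds(times[2]),
      subtitle = paste(b[-(1:2)], collapse = collapse),
      stringsAsFactors = FALSE
    )
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

# Made-up sample input in SubRip layout (timestamps are illustrative).
example_lines <- c(
  "1",
  "00:01:25,210 --> 00:01:28,004",
  "I owe everything",
  "to George Bailey.",
  "",
  "2",
  "00:01:28,380 --> 00:01:30,340",
  "Help him, dear Father."
)

parsed <- parse_srt_sketch(example_lines, collapse = " ")
print(parsed)
```

With `collapse = " "`, the two text lines of the first group are joined into a single string; with the default `"\n"`, the line break would be kept inside the `subtitle` value.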
# read linear text to tabular data
read_srt(srt_example(), collapse = " ")
#> # A tibble: 2,268 × 4
#>        n start   end subtitle
#>    <int> <dbl> <dbl> <chr>
#>  1     1  85.2  88.0 I owe everything to George Bailey.
#>  2     2  88.4  90.3 Help him, dear Father.
#>  3     3  90.7  93.7 Joseph, Jesus and Mary,
#>  4     4  93.8  96.4 help my friend Mr. Bailey.
#>  5     5  96.9  99.5 Help my son George tonight.
#>  6     6 100.  102.  He never thinks about himself, God.
#>  7     7 102.  104.  That's why he's in trouble.
#>  8     8 104.  105.  George is a good guy.
#>  9     9 106.  108.  Give him a break, God.
#> 10    10 108.  110.  I love him, dear Lord.
#> # ℹ 2,258 more rows