5 min read

Date & Time in R - Parse Date & Time

Introduction

This is the sixth tutorial in the series Handling Date & Time in R. In this tutorial, we will learn to parse date and time in R.

Resources

Below are the links to all the resources related to this tutorial:

new courses ad

Parse Date & Time

While creating date-time objects, we specified different formats using the conversion specification but most often you will not create date/time and instead deal with data that comes your way from a database or API or colleague/collaborator. In such cases, we need to be able to parse date/time from the data provided to us. In this section, we will focus on parsing date/time from character data. Both base R and the lubridate package offer functions to parse date and time and we will explore a few of them in this section. We will initially use functions from base R and later on explore those from lubridate which will give us an opportunity to compare and contrast. It will also allow us to choose the functions based on the data we are dealing with.

strptime() will convert character data to POSIXlt. You will use this when converting from character data to date/time. On the other hand, if you want to convert date/time to character data, use any of the following:

  • strftime()
  • format()
  • as.character()

The above functions will convert POSIXct/POSIXlt to character. Let us start with a simple example. The data we have been supplied has date/time as character data and in the format YYYYMMDD i.e. nothing separates the year, month and date from each other. We will use strptime() to convert this to an object of class POSIXlt.

rel_date <- strptime("20191212", format = "%Y%m%d")
class(rel_date)
## [1] "POSIXlt" "POSIXt"

If you have a basic knowledge of conversion specifications, you can use strptime() to convert character data to POSIXlt. Let us quickly explore the functions to convert date/time to character data before moving on to the functions from lubridate.

rel_date_strf <- strftime(rel_date)
class(rel_date_strf)
## [1] "character"
rel_date_format <- format(rel_date)
class(rel_date_format)
## [1] "character"
rel_date_char <- as.character(rel_date)
class(rel_date_char)
## [1] "character"

As you can see, all the 3 functions converted date/time to character. Time to move on and explore the lubridate package. We will start with an example in which the release date is formatted in 3 different ways but they have one thing in common i.e. the order in which the components appear. In all the 3 formats, the year is followed by the month and then the date.

To parse the release date, we will use parse_date_time() from lubridate which parses the input into POSIXct objects.

release_date <- c("19-12-12", "20191212", "19-12 12")
parse_date_time(release_date, "ymd")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"
parse_date_time(release_date, "y m d")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"
parse_date_time(release_date, "%y%m%d")
## [1] "2019-12-12 UTC" "2019-12-12 UTC" "2019-12-12 UTC"

Try to use strptime() in the above example and see what happens. Now, let us look at another data set.

release_date <- c("19-07-05", "2019-07-05", "05-07-2019")

What happens in the below case? The same date appears in multiple formats. How do we parse them? parse_date_time() allows us to specify mutiple date-time formats. Let us first map the dates to their formats.

Date Specification
19-07-05 ymd
2019-07-05 ymd
05-07-2019 dmy

The above specifications can be supplied as a character vector.

parse_date_time(release_date, c("ymd", "ymd", "dmy"))
## [1] "2019-07-05 UTC" "2019-07-05 UTC" "2019-07-05 UTC"

Great! We have used both strptime() and parse_date_time() now. Can you tell what differentiates parse_date_time() when compared to strptime()? We summarize it in the points below:

  • no need to include % prefix or separator
  • specify several date/time formats

There are other helper functions that can be used to

  • parse dates with only year, month, day components
  • parse dates with year, month, day, hour, minute, seconds components
  • parse dates with only hour, minute, second components

and are explored in the below examples.

# year/month/date
ymd("2019-12-12")
## [1] "2019-12-12"
# year/month/date
ymd("19/12/12")
## [1] "2019-12-12"
# date/month/year
dmy(121219)
## [1] "2019-12-12"
# year/month/date/hour/minute/second
ymd_hms(191212080503)
## [1] "2019-12-12 08:05:03 UTC"
# hour/minute/second
hms("8, 5, 3")
## [1] "8H 5M 3S"
# hour/minute/second
hms("08:05:03")
## [1] "8H 5M 3S"
# minute/second
ms("5,3")
## [1] "5M 3S"
# hour/minute
hm("8, 5")
## [1] "8H 5M 0S"

Note, in a couple of cases where the components are not separated by /, - or space, we have not enclosed the values in quotes.

Your Turn

Below, we have specified July 5th, 2019 in different ways. Parse the dates using strptime() or parse_date_time() or any other helper function.

  • July-05-19
  • JUL-05-19
  • 05.07.19
  • 5-July 2019
  • July 5th, 2019
  • July 05, 2019
  • 2019-July- 05
  • 05/07/2019
  • 07/05/2019
  • 7/5/2019
  • 07/5/19
  • 2019-07-05

*As the reader of this blog, you are our most important critic and commentator. We value your opinion and want to know what we are doing right, what we could do better, what areas you would like to see us publish in, and any other words of wisdom you are willing to pass our way.

We welcome your comments. You can email to let us know what you did or did not like about our blog as well as what we can do to make our post better.*

Email: