WORKING WITH DATE VARIABLES
This page lists commands you may find useful when working with date variables in STATA. Commands include extracting different date components (such as day, month, year), converting from one date frequency to another (e.g. daily to monthly date variable), and some other related commands.
The post also gives a few examples that are helpful in converting variables to proper STATA dates, which usually happens after importing data from other software such as Microsoft Excel.
EXTRACTING DATE COMPONENTS
Assuming you have a variable in daily date format, use commands below to extract different components of the date. Note that dow stands for the day of the week (for its coding please use “help dow”). The remaining components should be straightforward.
gen day = day(date) gen month = month(date) gen year = year(date) gen week = week(date) gen quarter = quarter(date) gen dow = dow(date)
BUILDING DATE VARIABLES OF DIFFERENT FREQUENCIES
Once you have different date components, you can build different variations of the date variable, each corresponding to a different frequency. The commands that are used are ym, yq, and yw.
gen date_monthly = ym(year,month) format %tm date_monthly gen date_quarterly = yq(year,quarter) format %tq date_quarterly gen date_weekly = yw(year,week) format %tw date_weekly
REVERTING TO DAILY DATES
Sometimes you may want to do the reverse of the above, e.g. to go from a monthly date variable to a standard daily date. However, beware that when you convert the 1986m12 to a daily date you will get the first day of December 1986. Similarly, when you convert the 1986w52 to a daily date you will get first day of the 52nd week of 1986.
While you may think “why would I need this?”, hold on with the thought just for a moment. Please see the last two lines of the code below, where by subtracting 1 from the daily date variable you can get the last day of the previous month, or, you can get the last day of the current month (irrelevant of the number of days in a month). Your creativity is the limit!
gen date_1 = dofm(date_monthly) gen date_2 = dofq(date_quarterly) gen date_3 = dofw(date_weekly) gen date_a = dofm(date_monthly) - 1 gen date_b = dofm(date_monthly + 1) - 1 format date_? %td
COMBINING DIFFERENT DATE COMMANDS
Also note that the commands above can be combined, i.e. used in a single line of code rather then doing it step by step. The code below will extract a monthly date variable directly from a standard daily date using commands ym, year, month in one single line of code.
gen date_monthly = ym(year(date),month(date)) format %tm date_monthly
FIXING DATE VARIABLES
All commands above assumed you already have a proper daily date variable. What I mean by proper is a date variable formatted in a way that STATA understands that it is a date variable.
What is often the case is that after importing data from other software such as Microsoft Excel you don’t have such proper STATA date variable. Below I give a few examples that should help with that.
FIXING DATE VARIABLES: Example 1
In the example below, we start with a variable that is simply a number. While this can be fixed in many ways, I suggest the following approach which I hope even STATA beginners will be able to digest easily. The code first converts the variable to a string. Next, that string is parsed into three components using the substr (stands for sub-string) command. Finally, the three components are destring-ed (converted to numeric) and combined into monthly date using the mdy (month,day,year) command.
*convert to string tostring date_numeric, g(date_string) *extract parts of the date_string variable gen year = substr(date_string,1,4) gen month = substr(date_string,5,2) gen day = substr(date_string,7,2) *destring the three date components to numeric variables destring year month day, replace *combine the three components to build a daily date variable gen date_proper = mdy(month, day, year) *format the variable as daily format %td date_proper
FIXING DATE VARIABLES: Example 2
Maybe the most common situation that you will come across is that your date variable gets imported as string (see data snapshot below). While here as well you could parse the string into pieces and build a date variable as we did before, you can also use a shortcut to get your proper date variable in one line of code.
Consider variable called date_string1. Using the date command you can fix this variable just by telling STATA what this string contains. The “MDY” option tells STATA that the date_string1 variable has a month, day, and year – appearing in that specific way. So we can extract the date1 variable in only one line of code.
The second example is slightly more complex. Here the variable date_string2 is similar to the one before, but the year component has only two digits rather than four. Note that in this case we have to tell STATA whether “08”is the year 1908 or 2008. We do so by adding another option to the date command, where we specify the topyear that appears in our data. Please see examples below: if we specify 1908 we suggest STATA the correct year is 1908, and if we specify 2008 then we suggest STATA the correct year is 2008.
gen date1 = date(date_string1,"MDY") gen date2a = date(date_string2,"MDY",1908) gen date2b = date(date_string2,"MDY",2008) format %td date1 date2?
These examples by no means exhaust the situations that one may come across when working with date variables in STATA. However, in my opinion, they represent a good start. For more info and further help, please contact me.
Below I will be continuously adding some more commands without giving details on their usage, but I hope more advanced users will benefit from those.
g month = month(date(mtemp, "M")) //convert e.g. JAN to 1