The files presented here contain historical estimates of hydrologic and meteorologic variables as computed by personnel at NOAA's Great Lakes Environmental Research Laboratory (GLERL) in Ann Arbor, Michigan. Similar data in Excel spreadsheet format have been published since the mid-1990s, but we are now changing over to simpler text-based formats for a variety of maintenance and usability reasons. While this process is underway you will see a mix of Excel-format and CSV-format (text) files. The exact format of the text files will vary.

Data presented here will generally be monthly estimates of an aggregated nature (i.e., the mean over the entire lake, the entire land surface, or the entire lake basin [lake + land]). These estimates are the result of either a physical process model or a prescribed technique for using point observations. Please note in ALL cases that these values represent just one possible estimate, and that every data set is sensitive to variations in model forcing data and/or observational data.

Historical data used to derive these estimates are generally obtained from federal data agencies in both the U.S. and Canada, and are not a product of observing systems/programs run by GLERL. Over time we have observed that the data available from these sources are updated and revised, which propagates into changes in the estimates that we compute. Additionally, we have often noticed issues in the raw data that made it past the agencies' standard QA/QC procedures. Note that the agencies deal with MASSIVE quantities of data and have only a few personnel assigned to the task, so it is not surprising that some erroneous data make it past their automated processes. To deal with the possibility of bad raw data, we have implemented our own filtering processes that attempt to identify and remove any obvious problem data values.
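
As a rough illustration of the kind of filtering described here, the sketch below applies a simple range check to a series of values and marks anything implausible as missing. It is only a minimal sketch: the variable names and limits are hypothetical and chosen for illustration, and they are not the criteria actually used in our processing.

```python
import math

# Hypothetical plausibility limits, for illustration only.
# The actual filter criteria are not listed in this document.
LIMITS = {
    "air_temp_c": (-60.0, 50.0),
    "precip_mm": (0.0, 500.0),
}


def range_filter(values, low, high):
    """Return a copy of values with missing or out-of-range entries set to None."""
    cleaned = []
    for v in values:
        if v is None or math.isnan(v) or v < low or v > high:
            cleaned.append(None)  # treat as missing
        else:
            cleaned.append(v)
    return cleaned


# Example: a short daily precipitation series with a sentinel value (-9999)
# and one implausibly large reading.
daily_precip = [0.0, 12.4, -9999.0, 3.1, 812.0]
low, high = LIMITS["precip_mm"]
print(range_filter(daily_precip, low, high))
```

Values flagged this way are treated as missing in whatever step comes next.
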
In some cases where the remaining data are too sparse for our models, we are then forced to replace the missing values with some "reasonable" estimate prior to using the dataset for model computations.

One frequently asked question regarding these data is, "Why are the values I see in the file today different from what was there last year for the same month? Shouldn't the value for MM,YYYY (e.g. June, 1985) be settled by now?" This is a reasonable question, and I agree that you would not expect historical values to undergo significant revision. You might expect them to change by a very small amount, but sometimes the differences in our files can be quite large. So what's going on? There can actually be several factors in play:

1. The models we use for estimating many of these quantities can change. These may be methodology changes (typically minor) or recalibrations of the model parameters. It is also possible, particularly in the future, that we will use a different model. All of these model changes will, obviously, result in changes to output variables such as evaporation.

2. Similarly, the methods used to aggregate station data into areal averages can change. We typically use a Thiessen polygon method for aggregating station meteorological data. If/when the underlying map used for this process is updated, the results will change.

3. The data agencies revise the underlying station/gauge data, as mentioned above. Each time we revise our monthly data sets, we pull a fresh copy of the station data to be aggregated and used in the models. When the data agency has identified an issue with older data and revised it, the station data I use today will reflect that revised (or removed) value, and the aggregated values will change accordingly.

4. We will occasionally note persistent errors with the raw data from a station.
When that happens, we will typically just remove the station from consideration to "be on the safe side". Such a station may exhibit obvious problems only for the most recent few years, but we will eliminate it from use for the entire period of record, because it would be much more complicated for our procedures to eliminate only the recent years.

5. The criteria used for filtering erroneous input data values have been modified several times over the years. These filters started out as just a few very simple range checks, but have expanded over time as we discovered more issues that slipped past them.

6. We have, on occasion, uncovered errors in our processing software. When that happens, we have to fix the problem and then create new files. As you would expect, that will result in revised data.

7. Similarly, with the old Excel files, there was the potential for a copy/paste mistake. They were built and updated manually by copying in tables of data in text format. On a few occasions I copied the wrong set of data (e.g. the table for Lake Erie instead of Lake Ontario, or the data for 1980-1990 instead of 1981-1991).

Monthly files currently available:

runoff_<lake>_arm.csv
These files are an aggregation of estimated streamflow into each lake from the land surface. The estimates are derived by extrapolating streamflow observations from a selected set of individual gauges, using a fairly simple Area-Ratio Method (ARM). For more information, please see the publication "Development and application of a North American Great Lakes hydrometeorological database - Part I: Precipitation, evaporation, runoff, and air temperature".

prc_<lake>_<area>_mon.csv
These files contain aggregated precipitation for each of the specified areas. The value of <area> will be either "lake", "land" or "basn", indicating overlake, overland or overbasin (land + lake), respectively.

Because it is often requested, I am now adding some daily datasets that will be located in a "daily" subdirectory.

subdata_*.csv
These daily files contain meteorological variables for either a single "subbasin" or some aggregation of subbasins. The subdata_???00.csv files are for the overlake area. (Our internal software system uses 0 as the subbasin number for overlake.) The subdata_???_land.csv and subdata_???_basn.csv files contain overland and overbasin values, as you would expect. These daily files are the source used when computing the monthly values that get posted.

References:

HUNTER, T.S., A.H. CLITES, A.D. GRONEWOLD, and K.B. CAMPBELL. Development and application of a North American Great Lakes hydrometeorological database - Part I: Precipitation, evaporation, runoff, and air temperature. Journal of Great Lakes Research 41(1):65-77 (DOI:10.1016/j.jglr.2014.12.006) (2015). https://www.glerl.noaa.gov/pubs/fulltext/2015/20150006.pdf