Homogenization in climate research means the removal of non-climatic changes. Next to changes in the climate itself, raw climate records also contain non-climatic jumps and changes, for example due to relocations or changes in instrumentation. The most widely used principle to remove these inhomogeneities is the relative homogenization approach, in which a candidate station is compared to a reference time series based on one or more neighboring stations. Because the candidate and reference station(s) experience about the same climate, non-climatic changes that happen in only one station can be identified and removed.
To study climate change and variability, long instrumental climate records are essential, but they are best not used directly. These datasets are the basis for assessing century-scale trends and for studying the natural (long-term) variability of climate, among other things. Their value, however, strongly depends on the homogeneity of the underlying time series. A homogeneous climate record is one where variations are caused only by variations in weather and climate. Long instrumental records are rarely, if ever, homogeneous.
Results from the homogenization of instrumental western climate records indicate that detected inhomogeneities in mean temperature series occur roughly once every 15 to 20 years. It should be kept in mind that most measurements have not been made specifically for climatic purposes, but rather to meet the needs of weather forecasting, agriculture and hydrology. Moreover, the typical size of the breaks is often of the same order as the climatic change signal during the 20th century. Inhomogeneities are thus a significant source of uncertainty for the estimation of secular trends and decadal-scale variability.
If all inhomogeneities were purely random perturbations of the climate records, their collective effect on the mean global climate signal would be negligible. However, certain changes are typical for certain periods and occurred in many stations. These are the most important causes, because collectively they can lead to artificial biases in climate trends across large regions.
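The difference between random and shared inhomogeneities can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: every station has exactly one break, break sizes are Gaussian, and the network size (1000), the spread (0.8 °C) and the shared offset (-0.3 °C) are made-up values, not taken from any dataset.

```python
import random

def network_mean_shift(n_stations, mean_size, sd_size, seed=0):
    """Average artificial shift over a network in which every station
    has exactly one break whose size is drawn from a Gaussian."""
    rng = random.Random(seed)
    sizes = [rng.gauss(mean_size, sd_size) for _ in range(n_stations)]
    return sum(sizes) / n_stations

# Hypothetical numbers: 1000 stations, break sizes with 0.8 °C spread.
random_case = network_mean_shift(1000, mean_size=0.0, sd_size=0.8)
biased_case = network_mean_shift(1000, mean_size=-0.3, sd_size=0.8)

print(f"purely random breaks:        {random_case:+.2f} °C")
print(f"breaks with a shared offset: {biased_case:+.2f} °C")
```

With purely random breaks the network average stays close to zero, whereas a shared offset survives the averaging almost unchanged.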
Causes of inhomogeneities
The best known inhomogeneity is the urban heat island effect. The temperature in cities can be warmer than in the surrounding countryside, especially at night. Thus, as cities grow, one may expect temperatures measured in cities to become higher. On the other hand, with the advent of aviation, many meteorological offices, and thus their stations, have been relocated from cities to nearby, typically cooler, airports.
Other non-climatic changes can be caused by changes in measurement methods. Meteorological instruments are typically installed in a screen to protect them from direct sun and wetting. In the 19th century it was common to use a metal screen in front of a window on a north-facing wall. However, the building may warm the screen, leading to higher temperature measurements. When this problem was realized, the Stevenson screen was introduced, typically installed in gardens, away from buildings. This is still the most typical weather screen, with its characteristic double-louvre door and walls for ventilation. The historical Montsouris and Wild screens, used around 1900, are open to the north and to the bottom. This improves ventilation, but it was found that infrared radiation from the ground can influence the measurement on sunny, calm days. Therefore, they are no longer used. Nowadays automatic weather stations, which reduce labor costs, are becoming more common; they protect the thermometer with a number of white plastic cones. Automation required a change from manually read liquid-in-glass thermometers to automated electrical resistance thermometers, which reduced the recorded temperature values in the USA.
Other climate elements also suffer from inhomogeneities. The precipitation amounts observed in the early instrumental period, roughly before 1900, are biased: they are about 10% lower than nowadays because the precipitation measurements were often made on a roof. At the time, instruments were installed on rooftops to ensure that the instrument was never shielded from the rain, but it was found later that, due to the turbulent flow of the wind on roofs, some rain droplets and especially snowflakes did not fall into the opening. Consequently, measurements are nowadays performed closer to the ground.
Another typical cause of inhomogeneities is a change in measurement location; many observations, especially of precipitation, are performed by volunteers in their garden or at their workplace. Changes in the surroundings often cannot be avoided, e.g., changes in the vegetation, the sealing of the land surface, and warm and sheltering buildings in the vicinity. There are also changes in measurement procedures, such as the way the daily mean temperature is computed (by means of the minimum and maximum temperatures, or by averaging over 3 or 4 readings per day, or based on 10-minute data). Changes in the observation times can also lead to inhomogeneities. A review by Trewin (2010) focuses on the causes of inhomogeneities.
Inhomogeneities are not always errors. This is seen most clearly for stations affected by warming due to the urban heat island effect. From the perspective of global warming, such local effects are undesirable, but to study the influence of climate on health such measurements are fine. Other inhomogeneities are due to compromises that have to be made between ventilation and protection against the sun and wetting in the design of a weather shelter. Trying to reduce one type of error (for a certain weather condition) in the design will often lead to more errors from the other factors. Meteorological measurements are not made in the laboratory. Small errors are inevitable and may not be relevant for meteorological purposes, but if such an error changes, it may well be an inhomogeneity for climatology.
To reliably study the real development of the climate, non-climatic changes have to be removed. The date of the change is often documented (such documentation is called metadata: data about data), but not always. Metadata is often only available in the local language. In the best case, there are parallel measurements with the original and the new set-up for several years. This is a WMO (World Meteorological Organization) guideline, but parallel measurements are unfortunately not performed very often, sometimes because the reason for stopping the original measurement is not known in advance, but probably more often to save money. By making parallel measurements with replicas of historical instruments, screens, etc., some of these inhomogeneities can still be studied today.
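To see why a change in the way the daily mean is computed can introduce a jump, the following sketch applies three of the procedures mentioned above to the same day. The hourly temperatures are invented for illustration, and the reading hours (07, 14 and 21 h) are an assumed convention, not one prescribed by the text.

```python
def mean_from_extremes(t_min, t_max):
    """Daily mean estimated from the minimum and maximum temperature."""
    return (t_min + t_max) / 2

def mean_from_readings(readings):
    """Daily mean estimated by averaging a set of fixed-hour readings."""
    return sum(readings) / len(readings)

# Hypothetical hourly temperatures (°C) for hours 0..23 of one day.
hourly = [8.0, 7.5, 7.0, 6.5, 6.0, 6.0, 7.0, 9.0, 11.0, 13.0, 15.0, 16.5,
          17.5, 18.0, 18.0, 17.5, 16.5, 15.0, 13.5, 12.0, 11.0, 10.0, 9.5, 9.0]

from_extremes = mean_from_extremes(min(hourly), max(hourly))
# Assumed observation hours: 07, 14 and 21 h.
from_three = mean_from_readings([hourly[7], hourly[14], hourly[21]])
from_all = mean_from_readings(hourly)

print(f"(min + max) / 2:    {from_extremes:.2f} °C")
print(f"three readings:     {from_three:.2f} °C")
print(f"all hourly values:  {from_all:.2f} °C")
```

For this made-up day the three procedures disagree by several tenths of a degree, so switching from one to another in the middle of a record would show up as a non-climatic change.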
Because you can never be sure that your metadata (station history) is complete, statistical homogenization should always be applied as well. The most commonly used statistical principle to detect and remove the effects of artificial changes is relative homogenization, which assumes that nearby stations are exposed to almost the same climate signal, so that the differences between nearby stations can be used to detect inhomogeneities. In the difference time series, the year-to-year variability of the climate is removed, as well as regional climatic trends. In such a difference time series, a clear and persistent jump of, for example, 1 °C can easily be detected and can only be due to changes in the measurement conditions.
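The idea of the difference time series can be sketched with synthetic data. All numbers here are assumptions chosen for illustration: a shared regional signal with a small trend, three neighbors, local noise of 0.2 °C, and a 1 °C jump inserted into the candidate in year 30. The helper names are hypothetical.

```python
import random

def reference_series(neighbors):
    """Composite reference: year-by-year average of the neighbor series."""
    return [sum(values) / len(values) for values in zip(*neighbors)]

def difference_series(candidate, reference):
    """Candidate minus reference, year by year."""
    return [c - r for c, r in zip(candidate, reference)]

def mean(values):
    return sum(values) / len(values)

rng = random.Random(42)
n_years, break_year, jump = 60, 30, 1.0

# Shared regional climate: a slow trend plus year-to-year variability.
regional = [0.01 * year + rng.gauss(0.0, 0.5) for year in range(n_years)]

def station(signal, jump_at=None, size=0.0):
    """Regional signal plus small local noise, optionally with one jump."""
    series = []
    for year, value in enumerate(signal):
        shift = size if jump_at is not None and year >= jump_at else 0.0
        series.append(value + rng.gauss(0.0, 0.2) + shift)
    return series

candidate = station(regional, jump_at=break_year, size=jump)
neighbors = [station(regional) for _ in range(3)]

diff = difference_series(candidate, reference_series(neighbors))
step = mean(diff[break_year:]) - mean(diff[:break_year])
print(f"estimated jump in the difference series: {step:+.2f} °C")
```

Because the regional signal cancels in the subtraction, only local noise and the inserted jump remain in `diff`, which is why the step stands out far more clearly than in the candidate series itself.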
If there is a jump (break) in a difference time series, it is not yet clear which of the two stations it belongs to. Furthermore, time series typically have more than just one jump. These two features make statistical homogenization a challenging and beautiful statistical problem. Homogenization algorithms typically differ in how they try to solve these two fundamental problems.
In the past, it was customary to compute a composite reference time series from multiple nearby stations, compare this reference to the candidate series, and assume that any jumps found are due to the candidate series. The latter assumption works because, by using multiple stations as reference, the influence of inhomogeneities on the reference is much reduced. However, modern algorithms no longer assume that the reference is homogeneous and can achieve better results this way. There are two main ways to do so. You can compute multiple composite reference time series from subsets of surrounding stations and test these references for homogeneity as well. Alternatively, you can use only pairs of stations and, by comparing all pairs with each other, determine which station most likely is the one with the break. If there is a break in 1950 in pairs A&B and B&C, but not in A&C, the break is likely in station B; with more pairs such an inference can be made with more certainty.
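The pairwise reasoning in the A, B, C example can be written as a simple voting rule. This is a minimal sketch, not the attribution logic of any particular published algorithm: it assumes that breaks have already been detected in each pair's difference series, that the same break is reported with exactly the same year in every pair, and that the function name and input layout are hypothetical.

```python
from collections import Counter

def attribute_breaks(pair_breaks):
    """Attribute each break year to the station involved in the most pairs
    showing that break. Returns None for a year when the vote is tied."""
    votes = {}  # year -> Counter of station names
    for pair, years in pair_breaks.items():
        for year in years:
            votes.setdefault(year, Counter()).update(pair)
    result = {}
    for year, counter in votes.items():
        ranked = counter.most_common()
        tied = len(ranked) > 1 and ranked[0][1] == ranked[1][1]
        result[year] = None if tied else ranked[0][0]
    return result

# Break years found in the difference series of each station pair.
pairs = {
    ("A", "B"): {1950},
    ("B", "C"): {1950},
    ("A", "C"): set(),
}
result = attribute_breaks(pairs)
print(result)  # {1950: 'B'}
```

With only a single pair the vote is tied and the rule returns `None`, which mirrors the statement that a jump in one difference series does not by itself reveal which station it belongs to.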
If there are multiple breaks in a time series, the number of combinations easily becomes very large and it becomes impossible to try them all. For example, in the case of five breaks (k=5) in 100 years of annual data (n=100), the number of combinations is about 100^5 = 10^10, or 10 billion. This problem is sometimes solved iteratively or hierarchically, by first searching for the largest jump and then repeating the search in both sub-sections until they are too small. This does not always produce good results. A direct way to solve the problem is an efficient optimization method called dynamic programming.
Sometimes there are no other stations in the same climate region. In this case, absolute homogenization is sometimes applied, and the inhomogeneities are detected in the time series of a single station. If there is a clear and large break at a certain date, one may be able to correct it, but smaller jumps and gradually occurring inhomogeneities (urban heat island or growing vegetation) cannot be distinguished from real natural variability and climate change. Data homogenized this way do not have the quality you may expect and should be used with much care.
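A minimal sketch of the dynamic programming idea follows. It assumes that the number of breaks k is given and that the quality of a segmentation is measured by the within-segment sum of squared deviations; real homogenization methods add a criterion for choosing k, which is omitted here. The function names and the test series are made up for illustration.

```python
def make_segment_cost(x):
    """Return cost(i, j): sum of squared deviations from the mean of x[i:j],
    computed in constant time from prefix sums."""
    s1, s2 = [0.0], [0.0]
    for value in x:
        s1.append(s1[-1] + value)
        s2.append(s2[-1] + value * value)

    def cost(i, j):
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    return cost

def optimal_breaks(x, k):
    """Positions of the k breaks that minimize the within-segment sum of
    squares. A break at position p means a new segment starts at x[p]."""
    n = len(x)
    cost = make_segment_cost(x)
    inf = float("inf")
    # best[m][j]: minimal cost of splitting x[:j] into m segments.
    best = [[inf] * (n + 1) for _ in range(k + 2)]
    prev = [[0] * (n + 1) for _ in range(k + 2)]
    best[0][0] = 0.0
    for m in range(1, k + 2):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1][i] + cost(i, j)
                if c < best[m][j]:
                    best[m][j] = c
                    prev[m][j] = i
    # Walk back from the full series to recover the segment starts.
    breaks, j = [], n
    for m in range(k + 1, 1, -1):
        j = prev[m][j]
        breaks.append(j)
    return sorted(breaks)

# Made-up difference series: three levels plus a small deterministic wiggle.
levels = [0.0] * 10 + [1.0] * 10 + [0.3] * 10
series = [level + 0.05 * ((7 * i) % 5 - 2) for i, level in enumerate(levels)]
print(optimal_breaks(series, 2))  # [10, 20]
```

The three nested loops evaluate on the order of k·n² candidate segments instead of enumerating every combination of break positions, because the best split of a shorter prefix is reused for every longer one.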
Inhomogeneities in climate data
By homogenizing climate datasets, it was found that inhomogeneities can sometimes cause biased trends in raw data, and that homogenization is indispensable to obtain reliable regional or global trends. For example, for the Greater Alpine Region a bias of half a degree in the temperature trend between the 1870s and 1980s was found, which was due to decreasing urbanization of the network and systematic changes in the time of observation. The precipitation records of the early instrumental period are biased by -10% due to the systematically higher installation of the gauges at the time. Other possible bias sources are new types of weather shelters, the change from liquid-in-glass thermometers to electrical resistance thermometers, the tendency to replace observers by automatic weather stations, the urban heat island effect, and the transfer of many urban stations to airports.
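Moreover, a benchmarking study of homogenization algorithms for monthly data (Venema et al., 2012), published in the EGU journal Climate of the Past, showed that state-of-the-art relative homogenization algorithms developed to work with an inhomogeneous reference perform best. The study also showed that automatic algorithms can perform as well as manual ones.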
References
- Auer, I., R. Bohm, A. Jurkovic, W. Lipa, A. Orlik, R. Potzmann, W. Schoner, M. Ungersbock, C. Matulla, P. Jones, D. Efthymiadis, M. Brunetti, T. Nanni, K. Briffa, M. Maugeri, L. Mercalli, O. Mestre, et al. "HISTALP - Historical instrumental climatological surface time series of the Greater Alpine Region". Int. J. Climatol., 27, pp. 17-46, doi:10.1002/joc.1377, 2007.
- Menne, M. J., Williams, C. N. jr., and Vose, R. S.: "The U.S. historical climatology network monthly temperature data, version 2". Bull. Am. Meteorol. Soc., 90, (7), 993-1007, doi:10.1175/2008BAMS2613.1, 2009.
- Brunetti, M., Maugeri, M., Monti, F., and Nanni, T.: "Temperature and precipitation variability in Italy in the last two centuries from homogenized instrumental time series". International Journal of Climatology, 26, pp. 345–381, doi:10.1002/joc.1251, 2006.
- Caussinus, H. and Mestre, O.: "Detection and correction of artificial shifts in climate series". Journal of the Royal Statistical Society: Series C (Applied Statistics), 53 (3), 405-425, doi:10.1111/j.1467-9876.2004.05155.x, 2004.
- Della-Marta, P. M., Collins, D., and Braganza, K.: "Updating Australia’s high quality annual temperature dataset". Austr. Meteor. Mag., 53, 277-292, 2004.
- Williams, C. N. jr., Menne, M. J., and Thorne, P. W.: "Benchmarking the performance of pairwise homogenization of surface temperatures in the United States". Journal of Geophysical Research-Atmospheres, 117, D5, doi:10.1029/2011JD016761, 2012.
- Menne, M. J., Williams, C. N. jr., and Palecki M. A.: "On the reliability of the U.S. surface temperature record". J. Geophys. Res. Atmos., 115, no. D11108, doi:10.1029/ , 2010.
- Begert, M., Schlegel, T., and Kirchhofer, W.: "Homogeneous temperature and precipitation series of Switzerland from 1864 to 2000". Int. J. Climatol., doi:10.1002/joc.1118, 25, 65–80, 2005.
- Trewin, B.: "Exposure, instrumentation, and observing practice effects on land temperature measurements". WIREs Clim. Change, 1, 490–506, doi:10.1002/wcc.46, 2010.
- Meulen, van der, J.P. and T. Brandsma. "Thermometer screen intercomparison in De Bilt (The Netherlands), part I: Understanding the weather-dependent temperature differences". Int. J. Climatol., doi:10.1002/joc.1531, 28, 371-387, 2008.
- Aguilar E., Auer, I., Brunet, M., Peterson, T. C., and Wieringa, J.: Guidelines on climate metadata and homogenization. World Meteorological Organization, WMO-TD No. 1186, WCDMP No. 53, Geneva, Switzerland, 55 p., 2003.
- Conrad, V. and Pollak, C.: Methods in Climatology. Harvard University Press, Cambridge, MA, 459 p., 1950.
- Venema, V., O. Mestre, E. Aguilar, I. Auer, J.A. Guijarro, P. Domonkos, G. Vertacnik, T. Szentimrey, P. Stepanek, P. Zahradnicek, J. Viarre, G. Müller-Westermeier, M. Lakatos, C.N. Williams, M.J. Menne, R. Lindau, D. Rasol, E. Rustemeier, K. Kolokythas, T. Marinova, L. Andresen, F. Acquaotta, S. Fratianni, S. Cheval, M. Klancar, M. Brunetti, Ch. Gruber, M. Prohom Duran, T. Likso, P. Esteban, Th. Brandsma. "Benchmarking homogenization algorithms for monthly data". Climate of the Past, 8, 89-115, doi:10.5194/cp-8-89-2012, 2012.
- Alexandersson, H.: "A homogeneity test applied to precipitation data". J. Climatol., doi:10.1002/joc.3370060607, 6, 661-675, 1986.
- Szentimrey, T.: "Multiple Analysis of Series for Homogenization (MASH)". Proceedings of the second seminar for homogenization of surface climatological data, Budapest, Hungary; WMO, WCDMP-No. 41, 27-46, 1999.
- Böhm R., Auer, I., Brunetti, M., Maugeri, M., Nanni, T., and Schöner, W.: "Regional temperature variability in the European Alps 1760–1998 from homogenized instrumental time series". International Journal of Climatology, doi:10.1002/joc.689, 21, pp. 1779–1801, 2001.
- Auer, I., Böhm, R., Jurkovic, A., Orlik, A., Potzmann, R., Schöner, W., et al.: "A new instrumental precipitation dataset for the Greater Alpine Region for the period 1800–2002". International Journal of Climatology, doi:10.1002/joc.1135, 25, 139–166, 2005.
- Brunet, M., Asin, J., Sigró, J., Banón, M., García, F., Aguilar, E., Esteban Palenzuela, J., Peterson, T. C., and Jones, P.: "The minimization of the screen bias from ancient Western Mediterranean air temperature records: an exploratory statistical analysis". Int. J. Climatol., doi:10.1002/joc.2192, 2010.