Remote Sensing | Article

Individual Tree Detection and Classification with UAV-Based Photogrammetric Point Clouds and Hyperspectral Imaging

Olli Nevalainen 1,*, Eija Honkavaara 1, Sakari Tuominen 2, Niko Viljanen 1, Teemu Hakala 1, Xiaowei Yu 1, Juha Hyyppä 1, Heikki Saari 3, Ilkka Pölönen 4, Nilton N. Imai 5 and Antonio M. G. Tommaselli 5

1 Finnish Geospatial Research Institute, National Land Survey of Finland, Geodeetinrinne 2, 02430 Masala, Finland; eija.honkavaara@nls.fi (E.H.); niko.viljanen@nls.fi (N.V.); teemu.hakala@nls.fi (T.H.); xiaowei.yu@nls.fi (X.Y.); juha.hyyppa@nls.fi (J.H.)
2 Natural Resources Institute Finland, PL 2, 00791 Helsinki, Finland; sakari.tuominen@luke.fi
3 VTT Microelectronics, P.O. Box 1000, FI-02044 VTT, Finland; heikki.saari@vtt.fi
4 Department of Mathematical Information Technology, University of Jyväskylä, P.O. Box 35, FI-40014 Jyväskylä, Finland; ilkka.polonen@jyu.fi
5 Department of Cartography, Univ. Estadual Paulista (UNESP), Presidente Prudente, SP 19060-900, Brazil; nnimai@fct.unesp.br (N.N.I.); tomaseli@fct.unesp.br (A.M.G.T.)
* Correspondence: olli.nevalainen@nls.fi; Tel.: +358-50-911-9062

Academic Editors: Farid Melgani, Francesco Nex, Norman Kerle and Prasad S. Thenkabail
Received: 8 December 2016; Accepted: 18 February 2017; Published: 23 February 2017

Abstract: Small unmanned aerial vehicle (UAV) based remote sensing is a rapidly evolving technology. Novel sensors and methods are entering the market, offering completely new possibilities to carry out remote sensing tasks. Three-dimensional (3D) hyperspectral remote sensing is a novel and powerful technology that has recently become available to small UAVs. This study investigated the performance of UAV-based photogrammetry and hyperspectral imaging in individual tree detection and tree species classification in boreal forests. Data from eleven test sites with 4151 reference trees representing various tree species and developmental stages were collected in June 2014 using a UAV remote sensing system equipped with a frame format hyperspectral camera and an RGB camera, in highly variable weather conditions. Dense point clouds were measured photogrammetrically by automatic image matching using high resolution RGB images with a 5 cm point interval. Spectral features were obtained from the hyperspectral image blocks, the large radiometric variation of which was compensated for by using a novel approach based on radiometric block adjustment with the support of in-flight irradiance observations. Spectral and 3D point cloud features were used in the classification experiment with various classifiers. The best results were obtained with Random Forest and Multilayer Perceptron (MLP), which both gave 95% overall accuracies and an F-score of 0.93. The accuracy of individual tree identification from the photogrammetric point clouds varied between 40% and 95%, depending on the characteristics of the area. Challenges in the reference measurements might also have reduced these numbers. The results were promising, indicating that hyperspectral 3D remote sensing was operational from a UAV platform even in very difficult conditions. These novel methods are expected to provide a powerful tool for automating various environmental close-range remote sensing tasks in the very near future.

Keywords: UAV; hyperspectral; photogrammetry; radiometry; point cloud; forest; classification
Remote Sens. 2017, 9, 185; doi:10.3390/rs9030185

1. Introduction

Knowing the tree species composition of a forest enables the estimation of the forest's economic value and produces valuable information for studying forest ecosystems. Today, forest inventory in Scandinavia is based on an area-based approach [1,2] where laser scanning (point density about 1 pts/m²) and aerial images (resolution typically 0.5 m) are used as inventory data. However, using these approaches, species-specific diameter distributions have poor prediction accuracy, and improvement in tree species detection is needed.

Forest data collection using remote sensing methods has mostly concentrated on forest stand (i.e., the ecologically homogeneous and spatially continuous part of a forest) and plot-level data. However, stand-level forest variables are typically an average or sum over the set of trees composing the stand. In calculating forest inventory variables such as the volume and biomass of the growing stock, tree-level models are typically used nowadays [3–5]. Very high resolution remote sensing data allows moving from the stand level to the level of individual trees, which has certain benefits, for example in precision forestry, forest management planning, biomass estimation and modeling forest growth [6]. Increasing the level of detail can also improve the detailed modeling of forests, which can be used to predict forest growth and to improve satellite-based remote sensing of forests by more accurately modeling the radiative transfer within the forest canopy.

Forest and tree species classification using multi- or hyperspectral imaging or laser scanning has been widely studied [7–9]. However, the data has mainly been captured from manned aircraft or satellites, and consequently studies have focused more on the forest or plot level. The challenge in passive imaging has been the dependence on sunlight and the high impact of changing and differing illumination conditions on the radiometry of the data [9–13]. Shadowing and brightening of individual tree crowns cause the pixels of a single tree crown to range from very dark to very bright. A few methods have been developed and suggested to reduce the effect of the changing illumination in the forest canopy [14], but none of them have been extensively tested with high resolution spectral imaging data from individual trees. In addition, illumination changes can be beneficial in classification tasks, since they potentially provide species-specific information on the tree structure [15,16].

Airborne Laser Scanning (ALS), both discrete and full-waveform, has been used to classify trees in boreal forests [17–21]. The most demanding task using passive imaging systems has always been discriminating pines from spruces due to their spectral similarity. However, structural data from laser scanning has been shown to capture the structural differences between these species, such as the vertical extent of the leafy canopy [18,22,23]. Individual trees have been detected using passive data [24], mainly through image segmentation based on textural features [25,26], but the development of dense image matching methods and computing power has enabled the production of high resolution photogrammetric point clouds. Their ability to produce structural data from a forest has been presented in the literature [27–30].
Photogrammetric point clouds provide detailed top-of-canopy information that enables the computation of Canopy Height Models (CHM) [29] and thus the detection of individual trees [31]. However, the notable limitation of photogrammetric point clouds compared to laser scanning is that passive imaging cannot penetrate the canopy well [30], especially with aerial data from manned aircraft. Thus, laser scanning, which can penetrate forest canopies more deeply, has been the main method of providing information on forest structure [1,2], especially at the individual tree level [32]. Photogrammetric point clouds derived from data captured using manned aircraft have been used, but their spatial resolution is usually not high enough to produce good detection results.

Small unmanned aerial vehicles (UAVs) have been rapidly incorporated in various remote sensing applications, including forestry [33–35]. The use of UAVs in aerial imaging has enabled measurements with higher spatial resolution, improving the resolution of photogrammetric point clouds and the acquisition of three-dimensional (3D) structural data from the forest [36,37]. Some studies have used UAV data for individual tree height determination with good results [38]. Multispectral data acquired from UAVs has been used in tree species classification, but the data has been limited to RGB images and one near-infrared (NIR) channel using NIR-modified cameras [39–41]. The development of low-weight hyperspectral imaging sensors is likely to increase the use of spectral data collected from UAVs in forestry applications [34].

Recently, the development of small hyperspectral imaging sensors has enabled high spectral and spatial resolution measurements from UAVs. Several pushbroom hyperspectral sensors have been implemented on UAVs [42–46]. Novel hyperspectral cameras operating on the frame format principle offer interesting possibilities for UAV remote sensing owing to their stable imaging geometry and the opportunity to make 3D hyperspectral measurements [47–49]. Näsi et al. [50] presented the first study with 3D hyperspectral UAV imaging in individual tree-level analysis of bark beetle damage in spruce forests. To the best of the authors' knowledge, the classification of individual trees using hyperspectral imagery from UAVs has not yet been studied.

Novel hyperspectral imaging technology based on a variable air gap Fabry–Pérot interferometer (FPI) operating in the visible to near-infrared spectral range (500–900 nm; VNIR) [47–49] was used in this study. The FPI technology makes it possible to manufacture a lightweight, frame format hyperspectral imager operating on the time-sequential principle. The FPI camera can be easily mounted on a small UAV together with an RGB camera, which enables simultaneous hyperspectral imaging and high spatial resolution photogrammetric point cloud creation [50].

The objective of this study is to test the use of high resolution photogrammetric point clouds together with hyperspectral UAV imaging in individual tree detection and classification in boreal forests. In particular, the data processing challenges in a real forest environment will be studied, and the importance of different spectral and structural features in tree species classification will be assessed.
Previous studies using UAV data have not utilized both spectral and structural data in tree species classification, nor have they provided reliable complete workflows to detect and classify individual trees from large forested scenes. The materials and methods are presented in Section 2. The results are given in Section 3 and discussed in Section 4. Conclusions are provided in Section 5.

2. Materials and Methods

2.1. Study Area and Reference Tree Data Collection

The study area was the Vesijako research forest area in the municipality of Padasjoki in southern Finland (approximately 61°24′N and 25°02′E). The area has been used as a research forest by the Natural Resources Institute of Finland (and its predecessor, the Finnish Forest Research Institute). Eleven experimental test sites from stands dominated by pine (Pinus sylvestris), spruce (Picea abies), birch (Betula pendula) and larch (Larix sibirica) were used in this study. The test sites represented development stages from young to middle-aged and mature stands (no seedling stands or clear-cut areas). Within each test site, there were 1–16 experimental plots treated with differing silvicultural schemes and cutting systems (altogether 56 experimental plots). The size of the experimental plots was 1000–2000 m². Within the experimental plots, all trees with a breast-height diameter of at least 50 mm were measured as tally trees in 2012–2013. For each tally tree, the following variables were recorded: relative location within the plot, tree species, and diameter at breast height. Height was measured for a number of sample trees in each plot and estimated for all tally trees using height models calibrated by the sample tree measurements. The geographic locations of the experimental plot corner points were measured with a Global Positioning System (GPS) device, and the locations were processed with local base station data, with an average error of approximately 1 m.

For this study, a total of 300 fixed-radius (9 m) circular sample plots were placed on the experimental plots in the eleven test sites (4–8 circular plots per experimental plot, depending on the size and shape of the experimental plots). The plot variables of the circular sample plots were calculated on the basis of the tree maps of the experimental plots. The statistics of the main forest variables in the field measurement data are presented in Tables 1 and 2. Some of the sample plots have an exceptionally high amount of growing stock compared to values typical for this geographic area, which can be seen in the maximum values and standard deviations of the volumes in the field observations of this study area. The areas with the highest amount of growing stock are dominated by pine and larch. Figure 1 shows the locations of the study areas.

Table 1. Average, maximum (Max), minimum (Min) and standard deviation (Std.) of field variables in the field data in the study area.

Forest Variable                    Average   Max      Min    Std.
Total volume, m³/ha                328.3     1160.8   33.2   220.4
Volume of Scots pine *, m³/ha      240.0     1110.8   0      212.2
Volume of Norway spruce, m³/ha     46.6      420.0    0      88.7
Volume of broadleaved, m³/ha       41.7      352.6    0      84.5
Mean diameter, cm                  22.9      55.4     13.9   7.6
Mean height, m                     21.2      39.4     14.3   5.1
Basal area, m²/ha                  31.3      78.5     3.7    14.6
* Including Larix sp.

Table 2. Forest plot densities in the study area. Densities are presented as trees per hectare.
Test Site   Min    Max    Mean   Median   Number of Plots
v01         676    2964   1699   1756     16
v02         310    1938   898    625      11
v0304       333    1247   657    594      9
v05         173    1666   800    715      6
v06         1381   1381   1381   1381     1
v07         850    850    850    850      1
v08         766    983    855    818      3
v09         435    2701   1909   2250     4
v10         691    2016   1354   1354     2
v11         468    468    468    468      1

Figure 1. The geographical locations of the study areas. The area is located in the Vesijako research forest area in the municipality of Padasjoki in southern Finland (approximately 61°24′N and 25°02′E).

The reference trees collected in the field measurements were visually compared to the collected UAV orthomosaics in order to check their geometric correspondence at the individual tree level (the orthomosaic calculation is described in Section 2.3). Some misalignment could be observed, which was due to the challenges of tree positioning in the ground conditions, the georeferencing quality of the UAV orthomosaics, and the differing characteristics of the ground view and the aerial view. The misaligned reference trees were manually aligned with the trees in the UAV orthomosaics, or removed if the corresponding tree could not be identified reliably from the UAV orthomosaics. This was performed using Quantum GIS (QGIS) by overlaying the field data on the RGB and FPI mosaics.

2.2. Remote Sensing Data Capture

Altogether, 11 test sites were captured using a small UAV in eight separate flights on 25–26 June 2014 in the Vesijako test site (Tables 3 and 4). For georeferencing purposes, three to nine Ground Control Points (GCPs) with cross-shaped signals with an arm length of 3 m and a width of 30 cm were installed in each test area. Reflectance panels of size 1 m × 1 m with nominal reflectances of 0.03, 0.1 and 0.5 were installed in the area for reflectance transformation purposes [51]. The reference reflectance values were measured in a laboratory with an estimated accuracy of 2%–5% using the FIGIFIGO goniospectrometer [52].

The UAV remote sensing system of the Finnish Geospatial Research Institute (FGI) was used in the data capture. The UAV platform frame was a Tarot 960 hexacopter with Tarot 5008 (340 KV) brushless electric motors. The autopilot was a Pixhawk equipped with Arducopter 3.15 firmware.
The system's payload capacity is about 3 kg, and the flight time is 10–30 min, depending on the payload, battery, conditions, and flight style. A novel hyperspectral camera based on a tuneable Fabry–Pérot interferometer (FPI) [47–49] was used to capture the spectral data. The FPI camera captures frame-format hyperspectral images in a time-sequential mode. Due to the sequential exposure of the individual bands (0.075 s between adjacent exposures, 1.8 s for the entire data cube with 24 exposures), each band of the data cube has a slightly different position and orientation, which has to be taken into account in the post-processing phase. The image size was 1024 × 648 pixels, and the pixel size was 11 µm. The FPI camera has a focal length of 10.9 mm; the field of view (FOV) is ±18° in the flight direction, ±27° in the cross-flight direction, and ±31° at the format corner. The camera system has an irradiance sensor (based on the Intersil ISL29004 photodetector) to measure the wideband irradiance during each exposure [53]. The UAV was also equipped with an Ocean Optics spectrometer and irradiance sensor to monitor the illumination conditions, but due to technical problems the data quality was not suitable for radiometric correction. A GPS receiver is used to record the exact time of the first exposure of each data cube. The spectral settings can be selected according to the requirements; in this study, altogether 33 bands were used, with full widths at half maximum (FWHM) of 11–31 nm. The settings used in this study are given in Table 5. In order to capture high spatial resolution data, the UAV was also equipped with an ordinary RGB compact digital camera, the Samsung NX1000. The camera has a 23.5 × 15.7 mm complementary metal-oxide semiconductor (CMOS) sensor with 20.3 megapixels and a 16 mm lens. The UAV system is presented in Figure 2.

The flying height was 83–94 m from the ground level, providing an average Ground Sampling Distance (GSD) of 8.6 cm for the FPI images and 2.3 cm for the RGB images at ground level. The flight height was 62–73 m from the tree tops (calculated for an average tree height of 21 m). Thus, the average GSDs were 6.5 cm and 1.8 cm at tree tops for the FPI and RGB data sets, respectively. The flight speed was about 4 m/s. The FPI image blocks had average forward and side overlaps of 67% and 61%, respectively, at the nominal ground level, and 58% and 50%, respectively, at the tree top level. For the RGB blocks, the average forward and side overlaps were 78% and 73%, respectively, at the ground level; at the level of the treetops, they were 72% and 65%, respectively. The overlaps of the RGB blocks were suitable for the orientation processing and point cloud generation. The overlaps of the FPI images were quite low for a forested scene with large height differences; thus, combined processing with the RGB images was necessary to enable the highest quality geometric reconstruction.
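The reported GSDs and treetop overlaps follow directly from the camera geometry and the flying height. Below is a minimal check of the block v01 figures (see Table 4), assuming simple pinhole relations; the helper functions are ours, not from the paper's toolchain:

```python
# GSD and overlap as functions of object distance, checked against the
# block v01 values (FPI pixel size 11 um, focal length 10.9 mm,
# 94 m above ground, 73 m above the treetops).

def gsd(distance_m, pixel_size_m=11e-6, focal_length_m=10.9e-3):
    """Pinhole relation: GSD = object distance * pixel size / focal length."""
    return distance_m * pixel_size_m / focal_length_m

def overlap_at(overlap_ground, h_ground, h_object):
    """Image footprints shrink with object distance, so overlap drops:
    p_obj = 1 - (1 - p_ground) * h_ground / h_object."""
    return 1 - (1 - overlap_ground) * h_ground / h_object

print(f"FPI GSD at ground:   {gsd(94):.3f} m")   # ~0.095 m (Table 4: 0.094)
print(f"FPI GSD at treetops: {gsd(73):.3f} m")   # ~0.074 m (Table 4: 0.073)
print(f"Forward overlap at treetops: {overlap_at(0.64, 94, 73):.2f}")  # ~0.54
```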
Table 3. Flight conditions and camera settings during the flights. The median irradiance was taken from the Intersil ISL29004 irradiance measurements.

Test Site   Date   Time (GPS *)   Weather   Solar Elevation   Sun Azimuth   Median Irrad.   Exposure (ms)
v01         26.6   11:07–11:23    Cloudy    50.91             199.22        2602            10
v02         26.6   12:09–12:22    Cloudy    47.38             219.63        4427            12
v0304       25.6   10:38–10:51    Varying   51.79             188.36        variable        6
v05         25.6   09:26–09:40    Varying   50.93             160.69        variable        6
v0607       25.6   12:14–12:24    Cloudy    47                221.30        3773            10
v08         26.6   09:58–10:09    Sunny     51.84             173.41        13894           10
v0910       25.6   13:51–14:12    Cloudy    37.20             249.45        2546            8
v11         26.6   08:49–08:58    Varying   49.1              148.44        13982           10
* GPS, Global Positioning System.

Table 4. Properties of the image blocks calculated at ground level and at tree top level (N GCP: number of ground control points (GCPs); FH: flying height; fw;sl: forward and side overlaps; FPI: Fabry–Pérot interferometer; GSD: ground sampling distance).

Block     N GCP | Ground: FH (m)  FPI GSD (m)  FPI fw;sl (%)  RGB GSD (m)  RGB fw;sl (%) | Treetops: FH (m)  FPI GSD (m)  FPI fw;sl (%)  RGB GSD (m)  RGB fw;sl (%)
v01       7     | 94   0.094   64;54   0.025   76;68 | 73   0.073   54;41   0.020   69;59
v02       4     | 88   0.088   65;57   0.024   77;70 | 67   0.067   53;43   0.018   69;61
v0304     9     | 85   0.085   69;62   0.023   79;73 | 64   0.064   59;49   0.017   73;65
v05       5     | 86   0.086   59;34   0.023   73;54 | 65   0.065   59;34   0.017   73;54
v06       3     | 94   0.094   78;79   0.025   85;86 | 73   0.073   71;73   0.020   80;81
v07       4     | 94   0.094   73;71   0.025   82;80 | 73   0.073   65;63   0.020   77;74
v08       5     | 86   0.086   64;64   0.023   76;75 | 65   0.065   52;52   0.017   68;67
v09       4     | 84   0.084   69;57   0.023   79;70 | 63   0.063   58;43   0.017   72;60
v10       3     | 83   0.083   69;67   0.022   79;77 | 62   0.062   58;56   0.017   72;70
v11       4     | 83   0.083   60;61   0.022   74;73 | 62   0.062   47;47   0.017   65;63
Average         | 86   0.086   67;61   0.023   78;73 | 65   0.065   58;50   0.018   72;65
Figure 2. The unmanned aerial vehicle (UAV) remote sensing system of the Finnish Geospatial Research Institute (FGI), based on the Tarot 960 hexacopter.

Table 5. Spectral settings of the Fabry–Pérot interferometer (FPI) camera at visible (VIS) and near-infrared (NIR) channels. L0: central wavelength; FWHM: full width at half maximum.

L0 (nm): 507.60, 509.50, 514.50, 520.80, 529.00, 537.40, 545.80, 554.40, 562.70, 574.20, 583.60, 590.40, 598.80, 605.70, 617.50, 630.70, 644.20, 657.20, 670.10, 677.80, 691.10, 698.40, 705.30, 711.10, 717.90, 731.30, 738.50, 751.50, 763.70, 778.50, 794.00, 806.30, 819.70

FWHM (nm): 11.2, 13.6, 19.4, 21.8, 22.6, 20.7, 22.0, 22.2, 22.1, 21.6, 18.0, 19.8, 22.7, 27.8, 29.3, 29.9, 26.9, 30.3, 28.5, 27.8, 30.7, 28.3, 25.4, 26.6, 27.5, 28.2, 27.4, 27.5, 30.5, 29.5, 25.9, 27.3, 29.9

The imaging conditions were quite windless, but the illumination varied considerably between flights. The illumination conditions were cloudy and quite uniform during flights v01, v02, v0607 and v0910. Test site v08 was captured under sunny conditions, and during flights v0304, v05 and v11 the illumination varied between sunny and cloudy. The irradiance recordings by the Intersil ISL29004 irradiance sensor during the flights are presented in Figure 3. The recordings were quite uniform during the flights captured in cloudy conditions. The variability in the irradiance measurements during flights v0304, v05, v08 and v11 was partially due to the changing weather and partially due to the tilting of the irradiance sensor in different flight directions, which caused it to receive different levels of irradiation.
Figure 3. Irradiance recordings during the different flights, measured using the onboard irradiance sensor.

The national ALS data by the National Land Survey of Finland (NLS) was used to provide the ground level, and it was also used as the geometric reference for evaluating the geometric quality of the photogrammetric processing [54]. The minimum point density of the NLS's ALS data is half a point per square metre, and the elevation accuracy of the points on well-defined surfaces is 15 cm. The horizontal accuracy of the data is 60 cm. The ALS data used in this study was collected on 12 May 2012.

2.3. UAV Data Processing

Rigorous processing was required in order to derive quantitative information from the imagery. The processing of FPI camera images is similar to that of any frame format camera images that cover the area of interest with a large number of images. The major difference is the processing of the nonaligned spectral bands due to the time-sequential imaging principle (Section 2.2), for which a registration procedure has been developed. The image data processing chain for tree parameter estimation included the following steps:

1. Applying radiometric laboratory calibration corrections to the FPI images.
2. Determination of the geometric imaging model, including interior and exterior orientations of the images.
3. Using dense image matching to create a Digital Surface Model (DSM).
4. Registration of the spectral bands of the FPI images.
5. Determination of a radiometric imaging model to transform the digital numbers (DNs) to reflectance.
6. Calculating the hyperspectral and RGB image mosaics.
7. Subsequent remote sensing analysis.

The image preprocessing and photogrammetric processing (steps 1–3) were carried out using a 3.5 GHz quad-core PC with 88 GB RAM and a GeForce GTX 980 graphics processing unit (GPU). For the band registration, radiometric processing and mosaic calculation (steps 4–6), several regular office PCs were used in parallel. The remote sensing analysis was performed using a 3.5 GHz quad-core PC with 24 GB RAM. In the following sections, the geometric (steps 2–4) and radiometric (steps 1, 5 and 6) processing and the estimation process (step 7) used in this investigation are described.
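The magnitude of the band misalignment that step 4 must remove can be estimated from the figures given in Section 2.2. A quick back-of-the-envelope check (the derivation is ours, using the stated exposure interval and flight speed):

```python
# Why band registration (step 4) is needed: the FPI bands are exposed
# sequentially, so the platform moves between band exposures.
# Figures from Section 2.2: 0.075 s between adjacent exposures,
# 24 exposures per data cube, flight speed about 4 m/s.

DT_BAND = 0.075      # s between adjacent band exposures
N_EXPOSURES = 24     # exposures per data cube
SPEED = 4.0          # m/s flight speed
GSD_FPI = 0.086      # m, average FPI ground sampling distance

shift_band = SPEED * DT_BAND                       # ~0.3 m between adjacent bands
shift_cube = SPEED * DT_BAND * (N_EXPOSURES - 1)   # ~6.9 m from first to last band

print(f"{shift_band:.2f} m (~{shift_band / GSD_FPI:.1f} px) between adjacent bands")
print(f"{shift_cube:.2f} m (~{shift_cube / GSD_FPI:.0f} px) across one data cube")
```

A displacement of several pixels per band, and tens of pixels across a cube, is far too large to ignore at this spatial resolution, which is why each band needs its own exterior orientation.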
2.3.1. Geometric Processing

Geometric processing included the determination of the orientations of the images and the determination of the 3D object model. The Agisoft PhotoScan Professional commercial software (AgiSoft LLC, St. Petersburg, Russia) and the FGI's in-house C++ software (for spectral band registration) were used for the geometric processing. PhotoScan performs photo-based 3D reconstruction based on feature detection and dense matching, and it is widely used in processing UAV images [50,55].

In the orientation processing, the quality was set to "high", which means that the full resolution images were used. The number of key points per image was set to 40,000 and the final number of tie points per image to 1000; an automated camera calibration was performed simultaneously with the image orientation (self-calibration). The initial processing provided image orientations and sparse point clouds in the internal coordinate system of the software. An automatic outlier removal was then performed on the basis of the residuals (re-projection error; 10% of the points with the largest errors were removed) and the standard deviations of the tie point 3D coordinates (reconstruction uncertainty; 10% of the points with the largest uncertainty were removed). Some outlier points were also removed manually from the sparse point cloud (points in the sky and below the ground). The image orientations were transformed to the ETRS-TM35FIN coordinate system using the GCPs in the area. The outputs of the process were the camera calibrations (Interior Orientation Parameters, IOP), the image exterior orientations in the object coordinate system (Exterior Orientation Parameters, EOP), and the 3D coordinates of the tie points.

The orientations of the FPI hypercubes were determined in a simultaneous processing with the RGB images in PhotoScan. Three FPI image bands (reference bands) were included in the processing: band 4 (L0 = 520.8 nm, dt = 1.125 s), band 12 (L0 = 590.4 nm, dt = 1.725 s) and band 16 (L0 = 630.7 nm, dt = 0.15 s), where L0 is the central wavelength and dt is the time difference to the first exposure of the data cube. The orientations of those FPI image bands that were not included in the PhotoScan processing were determined using the space resection method developed at the FGI. This method used the tie points calculated during the PhotoScan processing as GCPs, determined the corresponding image coordinates by correlation matching of each unoriented band to the reference band 4, and finally calculated the space resection. This processing produced EOPs for all the bands in all hypercubes.

For accurate dense point cloud generation, the orientations of the RGB datasets were also determined separately, without the FPI images. Dense point clouds with a 5 cm point interval were then generated using RGB images downsampled by a factor of two. A mild filtering, which allows large height differences within the data set, was used to eliminate outliers.
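The per-band space resection can be sketched compactly. The snippet below is a minimal stand-in for the FGI in-house implementation, assuming the tie-point object coordinates and their correlation-matched image coordinates in the unoriented band are already available; OpenCV's solvePnP is used purely for illustration:

```python
# Space resection of one unoriented FPI band: solve its exterior
# orientation from tie points (object space) and their matched image
# coordinates. The matching against reference band 4 is assumed done.
import numpy as np
import cv2

def resect_band(object_pts, image_pts, focal_px=990.9, cx=512.0, cy=324.0):
    """object_pts: (N, 3) tie-point coordinates; image_pts: (N, 2) pixels.
    Default interior orientation: 10.9 mm / 11 um ~ 990.9 px focal length,
    principal point at the centre of the 1024 x 648 image (an assumption)."""
    K = np.array([[focal_px, 0.0, cx],
                  [0.0, focal_px, cy],
                  [0.0, 0.0, 1.0]])
    ok, rvec, tvec = cv2.solvePnP(object_pts.astype(np.float64),
                                  image_pts.astype(np.float64), K, None)
    if not ok:
        raise RuntimeError("space resection failed")
    R, _ = cv2.Rodrigues(rvec)           # rotation of the band
    position = (-R.T @ tvec).ravel()     # projection centre in object space
    return R, position                   # the band's EOP
```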
2.3.2. Radiometric Processing

Our preferred approach was to analyze the different test areas simultaneously. Thus, it was necessary to scale the radiometric values in each data set to a common reference scale; calibration of the DNs to reflectance factors was the preferred approach. We installed the reflectance reference panels close to the takeoff place in each test site in order to carry out the reflectance transformation using the empirical line method. Unfortunately, the panels were inside the forest and surrounded by tall trees, so the illumination conditions at the reference panels did not correspond to the illumination conditions at the top of the canopy; because of this, they did not provide an accurate reflectance calibration by the empirical line method. A further challenge was that the illumination conditions were variable during many of the flights, producing large relative differences in the radiometric values within the blocks and between different blocks (Figure 3).

A modified approach was therefore used to eliminate the radiometric differences between images. We applied a radiometric block adjustment together with the in-flight irradiance data to normalize the differences within and between the blocks [49,53,56]. We selected test site v06 as the reference test site and calculated the empirical line calibration using this area (a_abs_ref, b_abs_ref); the area around the reflectance panels was relatively open in block v06, and the surrounding vegetation did not shade the panels. We then selected a reference image in each block i (i = 1, ..., 11) and normalized the rest of the images j in the block to this image using the relative scaling factor a_rel_vi_j. The reference image was selected so that its illumination conditions represented well the conditions of the majority of the data set. The a priori values for the relative correction parameters a_rel_vi_j were derived from the Intersil ISL29004 irradiance measurements:

$$ a_{rel\_vi\_j} = \frac{irradiance_{vi\_ref}}{irradiance_{vi\_j}}, $$

where irradiance_vi_j and irradiance_vi_ref are the irradiance measurements during the acquisition of image j and of the reference image ref, respectively. The a_rel_vi_j values were then adjusted using the radiometric block adjustment method [49,53,56]. The relative scaling factor s_vi_ref between each block i and the reference block v06 was calculated using the Intersil irradiance values of the reference images, scaled by the exposure times of the reference block and of block i (t_exp_ref, t_exp_vi). Median filtering was applied to the irradiance values to eliminate the instability of the irradiance measurements:

$$ s_{vi\_ref} = \frac{irradiance_{ref\_ref} \times t_{exp\_ref}}{irradiance_{vi\_ref} \times t_{exp\_vi}}. $$

Thus, the final equation for the transformation from DNs to reflectance was

$$ Reflectance = a_{abs\_ref}\, s_{vi\_ref}\, a_{rel\_vi\_j}\, DN_{vi\_j} + b_{abs\_ref}. $$

We did not apply a correction for bidirectional reflectance distribution function (BRDF) effects in the data set. BRDF effects were nonexistent in the image blocks captured under cloudy conditions. For the data sets captured in sunny conditions, the maximum view angles to the object in the different blocks were 2.6°–4.8° in the flight direction and 3.9°–9.4° in the cross-flight direction at the top of the canopy; thus, the BRDF effects could be assumed to be insignificant. The radiometric model parameters were calculated separately for each band.
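The transformation above is straightforward to express in code. A minimal sketch following the notation of the equations (all input names are ours, and the per-band empirical-line coefficients a_abs_ref and b_abs_ref are assumed to come from the block v06 calibration):

```python
# DN -> reflectance following Section 2.3.2, for one band of one image j
# in block i. Irradiance values are the (median-filtered) Intersil
# ISL29004 records; exposure times in the same unit for both blocks.
import numpy as np

def smooth_irradiance(irr, k=5):
    """Median filter the raw irradiance record to suppress instability."""
    irr = np.asarray(irr, dtype=float)
    pad = k // 2
    padded = np.pad(irr, pad, mode="edge")
    return np.array([np.median(padded[i:i + k]) for i in range(irr.size)])

def dn_to_reflectance(dn, irr_vi_ref, irr_vi_j, irr_ref_ref,
                      t_exp_ref, t_exp_vi, a_abs_ref, b_abs_ref):
    a_rel = irr_vi_ref / irr_vi_j                             # image j -> block reference image
    s = (irr_ref_ref * t_exp_ref) / (irr_vi_ref * t_exp_vi)   # block i -> reference block v06
    return a_abs_ref * s * a_rel * np.asarray(dn, dtype=float) + b_abs_ref
```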
2.3.3. Mosaic Calculation

Hyperspectral orthophoto mosaics were calculated with a 10 cm GSD from the FPI images using the FGI's in-house mosaicking software. The mosaics were calculated for each band separately and then combined to form a uniform hyperspectral data cube over each block area. The radiometric processing described above (absolute calibration and relative normalizations within and between image blocks) was applied to the FPI images. The RGB mosaics were calculated using the PhotoScan mosaicking module with a 5 cm GSD.

2.4. Workflow for Individual Tree Detection and Classification

The final products derived from the UAV data that were used in this study are:

• a photogrammetric point cloud with a 10 cm point interval computed from the RGB data;
• an RGB orthomosaic with a 5 cm GSD (used for visual evaluation of the reference data and classification of individual trees);
• an FPI orthomosaic with 33 spectral bands and a 10 cm GSD.

Figure 4 presents the workflow for training and validating the classifier using the reference data, and for detecting individual trees and classifying them using the trained classifier. The reference data collected from the field is used to create the classification model; a classification model utilizing both spectral and structural features was tested. The spectral features were computed from the FPI mosaics using Matlab (Matlab R2015b, The MathWorks, Inc., Natick, MA, USA) and the structural features from the photogrammetric point clouds using LAStools software (LAStools, version 160703, rapidlasso GmbH). Feature selection was tested to improve the classification performance and to evaluate the significance of various features. The feature selection and the classification training and testing were done using Weka software (Weka 3.8, University of Waikato). The individual trees were detected from the photogrammetric point clouds using FUSION software (FUSION version 3.50, US Department of Agriculture, Forest Service, Pacific Northwest Research Station). The classification features for the detected trees were derived using the same methods as for the reference data. Finally, the classification model created using the reference data was used to predict the classes of the detected trees. A detailed description of the steps in this workflow is given in the following sections.

Figure 4. Workflow for the individual tree detection and classification procedure.
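At its core, the Figure 4 workflow is a standard supervised-learning loop: train on the reference trees, then predict for the detected trees. A minimal sketch of that loop, with scikit-learn's Random Forest standing in for the Weka classifiers actually used, and random arrays standing in for the real feature matrices:

```python
# The Figure 4 workflow reduced to its supervised-learning core.
# scikit-learn's RandomForestClassifier stands in for the Weka
# classifiers; random data stands in for the extracted features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_ref = rng.random((4151, 347))    # spectral + structural features, reference trees
y_ref = rng.integers(0, 4, 4151)   # species labels (pine, spruce, birch, larch)
X_det = rng.random((1000, 347))    # features of trees detected from the point cloud

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_ref, y_ref)                               # train on the reference data
species = model.predict(X_det)                        # classify the detected trees
confidence = model.predict_proba(X_det).max(axis=1)   # prediction probability per tree
```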
2.4.1. Spectral Features

The FPI hyperspectral data cube consisted of 33 different spectral bands. The spectral features for classification were derived from the FPI orthomosaic by selecting the pixels within a circular area with a one-meter radius around the tree center. Several different spectral features were computed, which are summarized in Table 6. Most of the features are mean and median spectra calculated from differently selected pixel sets. Two of the features, MeanSpectraNormalized and ContinuumRemovedSpectra, are spectral features that normalize the spectra, aiming at reducing the effects of shadowing and illumination differences. The feature MeanSpectraNormalized normalizes the spectra by dividing each spectral band by the sum of all bands. ContinuumRemovedSpectra is a feature in which the spectrum is normalized by dividing it by a convex hull (i.e., continuum) fitted over the spectrum [57].

Table 6. Spectral features derived from the FPI data cubes. Each spectrum includes 33 spectral bands.

Feature Name              Description
MeanSpectra               Mean spectra of all pixels
MedianSpectra             Median spectra of all pixels
MeanSpectraBright         Mean spectra of pixels brighter than the average of all pixels
MedianSpectraBright       Median spectra of pixels brighter than the average of all pixels
MeanSpectraDark           Mean spectra of pixels darker than the average of all pixels
MedianSpectraDark         Median spectra of pixels darker than the average of all pixels
MeanSpectra6Max           Mean spectra of the six brightest pixels
MedianSpectra6Max         Median spectra of the six brightest pixels
MeanSpectraNormalized     Mean spectra of all pixels, where each spectral band is divided by the sum of all bands
ContinuumRemovedSpectra   Continuum removed spectra, computed using the code by [58]
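A sketch of this extraction step, assuming the mosaic is available as a (rows, cols, 33) reflectance array and the tree center is given in pixel coordinates (the function and variable names are ours):

```python
# Spectral features within a 1 m radius of a tree centre (Section 2.4.1).
# `mosaic` is a (rows, cols, 33) reflectance cube with a 10 cm GSD.
import numpy as np

def tree_spectra(mosaic, row, col, gsd=0.10, radius_m=1.0):
    r = int(round(radius_m / gsd))
    yy, xx = np.ogrid[-r:r + 1, -r:r + 1]
    mask = yy**2 + xx**2 <= r**2                     # circular pixel window
    pixels = mosaic[row - r:row + r + 1, col - r:col + r + 1][mask]  # (N, 33)
    brightness = pixels.mean(axis=1)                 # per-pixel mean over bands
    mean = pixels.mean(axis=0)
    return {
        "MeanSpectra": mean,
        "MedianSpectra": np.median(pixels, axis=0),
        "MeanSpectraBright": pixels[brightness > brightness.mean()].mean(axis=0),
        "MeanSpectraDark": pixels[brightness <= brightness.mean()].mean(axis=0),
        "MeanSpectra6Max": pixels[np.argsort(brightness)[-6:]].mean(axis=0),
        "MeanSpectraNormalized": mean / mean.sum(),  # divide by the sum of all bands
    }
```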
2.4.2. Structural Features

The 3D structural features were computed from the RGB point clouds with a 0.1 m GSD using algorithms provided by LAStools software. The computed structural parameters are explained in Table 7. The detailed workflow for computing the structural features is described in Appendix A.

Table 7. Structural features derived from the photogrammetric point clouds.

Feature Name                                        Description
Min                                                 Normalized minimum height
Avg                                                 Normalized average height
Std                                                 Normalized standard deviation of height
Ske                                                 Skewness (i.e., measure of asymmetry) of the point height distribution [59]
Kur                                                 Kurtosis (i.e., "tailedness") of the point height distribution [59]
Cov                                                 Canopy cover, computed as the number of points above breast height (i.e., 1.37 m) divided by the total number of points
Percentiles (N%, where N = 5, 15, 25, 50, 75, 90)   Normalized height percentiles: the height below which N percent of the points lie
Bicentiles (N%, where N = 50, 70, 80, 90, 95)       Percentage of the points whose heights are below N% of the maximum height

In order to avoid creating features that are biased towards classifying trees based on their height rather than on the point-height distribution, all tree metrics that are directly proportional to the tree height were normalized by dividing the feature value by the maximum height of the tree. Without this normalization, the larch trees in our study area, for example, could be correctly classified based on their height alone, since the larch samples all came from the same test site with a low variation in tree heights; this would decrease the generality of the classification. For the same reason, the maximum height was not used as a classification feature.
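A minimal sketch of these metrics for one tree segment, with NumPy/SciPy in place of LAStools (our names; `z` is assumed to hold the above-ground heights of the points in the tree's crown area):

```python
# Table 7 metrics for one tree segment, sketched with NumPy/SciPy in
# place of LAStools. `z` holds the above-ground heights of the points
# belonging to the tree's crown area.
import numpy as np
from scipy.stats import skew, kurtosis

def structural_features(z, breast_height=1.37):
    z = np.asarray(z, dtype=float)
    h_max = z.max()
    feats = {
        "Min": z.min() / h_max,             # height metrics normalized by max height
        "Avg": z.mean() / h_max,
        "Std": z.std() / h_max,
        "Ske": skew(z),                     # asymmetry of the height distribution
        "Kur": kurtosis(z),                 # "tailedness" (SciPy returns excess kurtosis)
        "Cov": np.mean(z > breast_height),  # share of points above breast height
    }
    for n in (5, 15, 25, 50, 75, 90):       # normalized height percentiles
        feats[f"P{n}"] = np.percentile(z, n) / h_max
    for n in (50, 70, 80, 90, 95):          # bicentiles
        feats[f"B{n}"] = 100 * np.mean(z < n / 100 * h_max)
    return feats
```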
2.4.3. Feature Selection

The total number of spectral and structural features was 347. In order to find good features with a high prediction ability, feature selection and its effect on the classification were also investigated. The feature selection was performed using Weka 3.8 software. The derived features included many that were likely to be highly correlated with each other, such as different spectral features at the same spectral band; therefore, feature selection methods that take into account the correlation between individual features were used. The chosen method was Correlation-based Feature Selection (CFS) [60], implemented for evaluating a subset of features (CfsSubsetEvaluation in Weka). This method strives for subsets that correlate highly with the class value while having a low internal correlation between single features; it evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. This is beneficial especially with spectral data from vegetation, where spectral properties are usually characterized better by a combination of different spectral bands than by information from only one spectral band. The use of vegetation indices (VIs) is also based on this; VIs have been used especially with sensors having a limited number of spectral bands (fewer than 10). The FPI camera provides data in 33 spectral bands, and a good classification method should extract from these data the same significant information that could be obtained by calculating VIs.

The CfsSubsetEvaluation method can be used with different search methods for the feature subset evaluation. The feature selection was performed with three different search methods:

• GeneticSearch: performs a search using the simple genetic algorithm [61].
• BestFirst: searches the space of attribute subsets by greedy hill climbing augmented with a backtracking facility.
• GreedyStepWise: performs a greedy forward or backward search through the space of attribute subsets.

The three feature selection methods were run using 10-fold cross-validation. The features selected at least eight times out of the ten runs were chosen as the best features of each method. The final feature set included the features that were among the best features of any two of the three feature selection methods. Feature selection was performed with and without the normalizing spectral features (i.e., MeanSpectraNormalized and ContinuumRemovedSpectra) to evaluate their benefits compared to the non-normalized spectral features.

2.4.4. Classification Methods

The classification was performed using Weka software. Several different types of classification algorithms were tested, including:

• instance-based: k-Nearest Neighbors (k-NN, with k values of 1, 3 and 7) [62];
• simple Bayes methods: Naive Bayes [63];
• decision trees: C4.5 (J48 in Weka) [64];
• non-linear methods: Multilayer Perceptron (MLP);
• ensemble methods: Random Forest [65].

The default parameters of the software were used for the classifiers. The parameters used with each classification method are described in Appendix B.

2.4.5. Classification Performance Evaluation

Since the reference data was slightly imbalanced (see Section 3.2), several evaluation metrics describing the classification performance and generality were computed. The general accuracy and reliability of the classification model were evaluated with the overall accuracy and Cohen's Kappa. The species-specific classification performance was evaluated with precision, recall, F-score, and confusion matrices.

The overall accuracy is the total number of correctly classified samples (i.e., the sum of true positives over all classes) divided by the total number of samples.

Recall (i.e., completeness, also known as producer's accuracy) is the proportion of trees classified into a specific class among all trees that truly belong to that class:

$$ Recall = \frac{tp}{tp + fn}, $$

where tp is the number of samples predicted positive that are actually positive (true positives), and fn is the number of samples predicted negative that are actually positive (false negatives).

Precision (i.e., exactness, also known as user's accuracy) is the proportion of the samples that truly belong to a specific class among all those classified as that class:

$$ Precision = \frac{tp}{tp + fp}, $$

where fp is the number of samples predicted positive that are actually negative (false positives).

The F-score (or F-measure) is the harmonic mean of recall and precision, and thus expresses the balance between the two:

$$ F = \frac{2 \times Recall \times Precision}{Recall + Precision}. $$

Cohen's Kappa statistic measures the agreement of the prediction with the true class. It compares the observed accuracy with the expected accuracy (i.e., it takes into account the random chance of classifying correctly):

$$ Kappa = \frac{Observed\ accuracy - Expected\ accuracy}{1 - Expected\ accuracy}, $$

$$ Observed\ accuracy = Overall\ accuracy = \frac{\sum tp}{N}, \qquad Expected\ accuracy = \sum_{i=1}^{k} \frac{n_{ti}}{N} \times \frac{n_{ci}}{N}, $$

where k is the number of classes, n_ti is the number of samples truly in class i, n_ci is the number of samples classified into class i, and N is the total number of samples in all classes.
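These metrics are easy to compute from a confusion matrix. A small sketch (ours), with rows as true classes and columns as predicted classes:

```python
# Overall accuracy, Kappa, and per-class recall/precision/F-score from
# a confusion matrix C, where C[i, j] counts samples of true class i
# predicted as class j.
import numpy as np

def classification_metrics(C):
    C = np.asarray(C, dtype=float)
    N = C.sum()
    tp = np.diag(C)
    recall = tp / C.sum(axis=1)        # tp / (tp + fn), per class
    precision = tp / C.sum(axis=0)     # tp / (tp + fp), per class
    f_score = 2 * recall * precision / (recall + precision)
    observed = tp.sum() / N            # overall accuracy
    expected = ((C.sum(axis=1) / N) * (C.sum(axis=0) / N)).sum()
    kappa = (observed - expected) / (1 - expected)
    return {"overall": observed, "kappa": kappa,
            "recall": recall, "precision": precision, "f_score": f_score}
```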
The validation of the classification performance was done using leave-one-out cross-validation (LOOCV). This method is a special case of k-fold cross-validation in which the number of folds equals the number of samples; thus, each sample is used once for testing while all the other samples are used for training. The final classification model used for prediction is a model trained using all samples.

The classification performance using different feature sets was also evaluated in order to assess the importance of the spectral and structural features. This evaluation was done using only the relatively fast classification methods, since using the full feature set (for example, with the MLP and LOOCV methods) is extremely time-consuming.

2.4.6. Individual Tree Detection

The individual trees were detected from the photogrammetric point clouds with a 0.1 m GSD. The tree detection was done using the FUSION software. The summarized workflow for the tree detection is as follows:

1. Convert the ASCII xyz file to LAS format.
2. Point cloud thinning.
3. Point cloud to Digital Surface Model (DSM) using a Triangulated Irregular Network (TIN) model.
4. DSM to Canopy Height Model (CHM) utilizing a Digital Terrain Model (DTM) computed from the NLS ALS data.
5. Tree detection from the CHM using a local maxima-based method developed in [66,67] (see the sketch after this section).
6. The final output was a CSV file with the following information: tree location, tree height, crown width, and minimum and maximum height.

The parameters in steps 2–5 were optimized by visually evaluating the resulting tree maps over the orthomosaics. These parameters relate to how much spatial smoothing and sampling is used in producing the CHM and in detecting trees from it; they depend on the forest type (i.e., typical crown shapes and canopy density) and on the smoothness of the photogrammetric point cloud. The goal of the visual evaluation and parameter optimization was the best compromise between false detections (false positives) and missed trees (false negatives). A detailed description of this workflow is given in Appendix C.
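The local-maxima idea of step 5 can be illustrated compactly. FUSION's own tools were used in the study; the sketch below reproduces the principle with SciPy, with the smoothing and window sizes as the tunable parameters discussed above (all names and default values are ours):

```python
# Treetop detection as smoothed local maxima of the CHM (step 5),
# sketched with SciPy in place of FUSION.
import numpy as np
from scipy import ndimage

def detect_trees(chm, gsd=0.10, smooth_sigma=2.0, window_m=2.0, min_height=2.0):
    chm_s = ndimage.gaussian_filter(chm, sigma=smooth_sigma)  # spatial smoothing
    w = int(round(window_m / gsd)) | 1                        # odd window size in pixels
    local_max = ndimage.maximum_filter(chm_s, size=w) == chm_s
    tops = local_max & (chm_s > min_height)                   # suppress ground/noise maxima
    rows, cols = np.nonzero(tops)
    return rows, cols, chm[rows, cols]                        # treetop locations and heights
```

Larger smoothing and window sizes merge neighbouring crowns (fewer false positives, more missed trees); smaller values do the opposite, which is exactly the trade-off tuned visually above.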
2.4.7. Evaluation of Individual Tree Detection and Classification Performance

To demonstrate the ability of UAV-based photogrammetric point clouds and hyperspectral imaging to detect and classify individual trees, the best performing classifier was used to classify all the trees detected from the test sites. The accuracy of the tree detection and classification was estimated by comparing the reference data with the detected and classified trees in each test site. The tree detection accuracy was assessed by searching for a detected tree within a specific radius of each reference tree. Different search radii (1 m, 1.5 m and 2 m) were tested to find out whether there was a difference between the tree crown centers determined from the image mosaics and the crown centers detected from the CHMs. The classification accuracy and reliability of the detected trees were estimated visually, using the resulting classified tree maps and the RGB orthomosaic, and by computing statistics of the prediction probabilities given by the trained classification model.
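This matching step amounts to a nearest-neighbour search; a minimal sketch (ours), using a k-d tree for the radius test:

```python
# Detection accuracy as in Section 2.4.7: a reference tree counts as
# detected if a detected tree lies within the search radius.
import numpy as np
from scipy.spatial import cKDTree

def detection_rate(reference_xy, detected_xy, radius=1.5):
    """reference_xy, detected_xy: (N, 2) arrays of tree positions in metres."""
    dist, _ = cKDTree(detected_xy).query(reference_xy, k=1)
    return float(np.mean(dist <= radius))   # fraction of reference trees found

# e.g. [detection_rate(ref_xy, det_xy, r) for r in (1.0, 1.5, 2.0)]
```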
Figure 5. Comparison of photogrammetric and airborne laser scanning (ALS) point cloud profiles in test sites v01 and v05.

Figure 6. Nadir (left) and oblique (right) views of the test site v02 RGB point cloud colored by height. Green indicates lower height and red indicates higher height.

The results of the block adjustments were utilized in the registration of all the bands to the reference band. The registration process was successful, and only minor misalignments could be observed between the bands in some cases. A visual assessment of the mosaics after radiometric processing indicated that the radiometric uniformity was sufficient in most cases (Figure 7), and suitable reflectance values were obtained from the individual image blocks. The most challenging test sites were v0304, v05, and v11, which all had variable illumination. The radiometric normalization method did not provide sufficient uniformity for test site v0304, so it was left out of the analysis. In the other two test sites (v05 and v11), the processing eliminated the major radiometric differences. The resulting spectra are discussed in Section 3.3.

Figure 7. Hyperspectral image mosaics of the test sites. The spectral values have been optimized for visual purposes, and the scaling is not the same for all mosaics.
3.2. Reference Data Processing

Due to inaccuracies in the field data collection and in the RGB and FPI mosaics, there were notable misalignments between the field data tree locations and the trees visually observed from the mosaics, as discussed in Section 2.1. The misalignment was worst in the areas with high tree density. In some areas, the misalignment was systematic, and the data could be aligned simply by shifting all the reference trees in that area in the same direction (in this study this was done manually; a data-driven alternative is sketched below). In other areas, the misalignment was not systematic; every tree therefore had to be moved one-by-one to the correct position, provided that the correct position and species of the tree could be identified from the orthomosaic. Some test sites included several tree species at different canopy heights; in such areas, only the highest trees could be identified. If reference data had to be moved, particular attention was given to moving it to the matching species. In most cases, it was possible to identify the tree species from the RGB orthomosaic. However, in some areas with high tree density and multiple tree species, it was not possible to identify the correct trees, and the reference data had to be omitted. Reference data for test site v0304 could not be aligned properly, and thus it was omitted completely from the analysis.

Although it was possible to fix the locations of the misaligned reference trees to match the correct tree species in the mosaics, it was not always possible to be certain that a reference tree was exactly the correct tree in the orthomosaics. This means that other reference data (such as height and DBH) might be incorrect, and thus could not be utilized in this study. Over half of the original data (approximately 8800 trees) had to be omitted from the analysis, since it could not be aligned to the correct tree species reliably. The final reference data included 4151 trees from nine different test sites. The proportions of the different tree species and their distribution across the test sites are summarized in Table 9.
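Where the misalignment is systematic, a constant shift can in principle be estimated from the data themselves. The following minimal sketch, under the assumption that most reference trees have a detected counterpart, estimates the shift as the median offset to the nearest detected tree top; the distance threshold is an illustrative value.

```python
import numpy as np
from scipy.spatial import cKDTree

def systematic_shift(reference_xy, detected_xy, max_dist=5.0):
    """Median xy offset from reference trees to their nearest detected tree.

    A robust estimate of a constant misalignment; pairs farther apart than
    max_dist (m, an illustrative threshold) are ignored as likely mismatches.
    """
    dist, idx = cKDTree(detected_xy).query(reference_xy)
    ok = dist <= max_dist
    return np.median(detected_xy[idx[ok]] - reference_xy[ok], axis=0)

# Usage: reference_aligned = reference_xy + systematic_shift(reference_xy, detected_xy)
```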
Table 9. The final reference tree data. Reference data for test site v0304 could not be aligned properly, and thus it was omitted from the analysis.

Test Site | Pine | Spruce | Birch | Larch | Total Number of Reference Trees
v01 | 1769 | 116 | 0 | 0 | 1885
v02 | 0 | 24 | 525 | 0 | 549
v05 | 540 | 80 | 25 | 0 | 645
v06 | 1 | 179 | 0 | 0 | 180
v07 | 0 | 102 | 2 | 0 | 104
v08 | 62 | 40 | 2 | 0 | 104
v09 | 114 | 259 | 25 | 0 | 398
v10 | 141 | 0 | 1 | 0 | 142
v11 | 0 | 22 | 0 | 122 | 144
Total | 2627 | 822 | 580 | 122 | 4151

3.3. Feature Selection

The most significant features according to the feature selection methods are summarized in Table 10. The first seven features of the full feature set with normalizing spectral features (i.e., MeanSpectraNormalized and ContinuumRemovedSpectra) were always included in the final feature set regardless of the method used; the remaining fifteen features were among the best predicting features with two of the three methods. If the normalizing spectral features were omitted from the analysis, the final feature set included 14 features, of which the first three were selected by every feature selection method.

Table 10. The best performing features selected by the feature selection methods, with and without normalized spectral features.

Feature Selection with Normalized Spectral Features | Feature Selection without Normalized Spectral Features
MeanSpectraNormalized_515nm | MeanSpectra_820
MeanSpectraNormalized_529nm | MedianSpectra_711
MeanSpectraNormalized_606nm | MedianSpectra6Max_820
MeanSpectraNormalized_657nm | MeanSpectra_806
ContinuumRemovedSpectra_764nm | MeanSpectraDark_657
b90 | MedianSpectraDark_657
b95 | MeanSpectra6Max_806
MeanSpectra_820nm | MeanSpectra6Max_820
MedianSpectra_711nm | MedianSpectra6Max_806
MedianSpectraDark_657nm | Min
MeanSpectraNormalized_508nm | p90
MeanSpectraNormalized_510nm | b90
MeanSpectraNormalized_521nm | b95
MeanSpectraNormalized_590nm | cov
MeanSpectraNormalized_599nm |
MeanSpectraNormalized_644nm |
MeanSpectraNormalized_670nm |
MeanSpectraNormalized_718nm |
MeanSpectraNormalized_806nm |
ContinuumRemovedSpectra_644nm |
ContinuumRemovedSpectra_657nm |
Min |

Most of the significant features were spectral features. Examples of the mean and median spectra and the effect of normalization can be seen in Figures 8–10.
When considering the average spectra of the species calculated from all the test sites, the differences between spruce and pine were the smallest, while birch and larch differed more clearly from the other species (Figure 8). The mean and median spectra are almost equal, which indicates that the species-specific spectral values are quite uniformly distributed and there are no significant outliers. When comparing the spectral signatures of the species in the different test sites, the differences were less than 25% in most cases (Figure 9). These differences were caused by uncertainties in the relative calibration procedure, but they were partially also due to the natural variability within each species. The spruce spectrum of test site v11 is clearly brighter than in the other test sites. The normalization slightly reduced the differences, but notable differences were still present after normalization. Test site v11 had changes in illumination during the data capture, and the radiometric correction was not able to remove all of the illumination changes, which affected the radiometric quality of the final data.

Figure 8. Mean, Median and MeanNormalized spectra of different species averaged over all test sites (top row) and the same spectra normalized to the Pine spectra (bottom row). Test sites with fewer than 10 tree samples have been omitted from the plot. The error bars present the standard error.
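The exact definition of the normalized spectral features is not restated here, so the following sketch makes an explicit assumption: each tree's mean spectrum is divided by its mean over all bands, which removes overall brightness (e.g., shadowing) differences while preserving spectral shape.

```python
import numpy as np

def mean_spectra_normalized(spectra):
    """Brightness-normalize tree spectra (assumed normalization scheme).

    spectra : (n_trees, n_bands) array of per-tree mean reflectance.
    Each row is divided by its mean over bands, so a uniformly shadowed
    (darker) crown gets the same normalized spectrum as a sunlit one.
    """
    return spectra / spectra.mean(axis=1, keepdims=True)

# Example: a crown at half brightness normalizes to the same spectrum.
s = np.array([[0.04, 0.06, 0.30]])      # toy 3-band spectrum (G, R, NIR)
assert np.allclose(mean_spectra_normalized(s),
                   mean_spectra_normalized(0.5 * s))
```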
Figure 9. Mean spectra of different species at different test sites (top row) and their relative differences (bottom row). In the relative differences, the Pine spectra have been normalized to test site v01, Spruce to v06, and Birch to v02. The Pine and Birch spectra were normalized to the areas that had the most tree samples; the Spruce spectra were normalized to area v06, which was the spectral reference area in this study. The error bars present the standard error.

Figure 10. Normalized mean spectra of different species at different test sites (top row) and their relative differences (bottom row). In the relative differences, the Pine spectra have been normalized to test site v01, Spruce to v06, and Birch to v02. The error bars present the standard error.

When the full feature set was used, most of the significant spectral features were normalized spectral features. Normalizing the spectra reduces the effects of shadowing and illumination differences, which is probably why these features performed best with this data, since some of the data had been collected in different flights and at different times of day, in varying illumination conditions. Illumination and shadowing differences are reduced especially when the illumination is strongly directional, but normalization also reduces possible scale differences between data collected on different occasions in diffuse illumination conditions. The relative spectral differences were reduced especially at the near-infrared (NIR) wavelengths, where changes in view angle and tree structure have the strongest impact. At the same time, the differences at visible wavelengths remain, and thus most of the best features are at visible wavelengths, although a few NIR features are also present.

If the normalizing spectral features were omitted from the analysis, the final feature set included more NIR features and, from the visible spectral region, only features in the red band (i.e., 657 nm). Without the normalization, the relative differences between the species are larger in the NIR region and smaller in the visible region, and thus the NIR features were the best predictors in this case. In addition, a few more 3D structural features were included in the final feature set without the normalizing spectral features. The most significant 3D structural features mostly described the canopy top layers, which was an expected result, since passive optical imaging does not penetrate the canopy well. The minimum height normalized by the maximum height performed well, as it provides a good estimate of the downward extent of the canopy; this carries species-specific information (for example, between pines and spruces), since the canopy of spruces extends lower than that of pines and birches. However, it must be noted that with photogrammetrically produced point clouds the penetration ability also depends strongly on the density of the forest (gaps between trees), which is not always a species-specific property (a sketch of such structural features is given below).
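As a minimal illustration of such structural features, the sketch below computes a few height metrics from the points of a single segmented crown. The exact feature definitions used in the study may differ; the names and formulas here are illustrative, chosen to resemble the percentile (p90), density (b90/b95), Min, and cov features in Table 10.

```python
import numpy as np

def structural_features(z, ground=0.0):
    """Simple crown structure metrics from the point heights of one tree.

    z : 1D array of point elevations (m) for a single crown.
    Returns a dict of illustrative metrics.
    """
    h = z - ground                         # heights above ground
    hmax = h.max()
    return {
        "Max": hmax,                       # tree height estimate
        "Min_norm": h.min() / hmax,        # downward canopy extent
        "p90": np.percentile(h, 90),       # 90th height percentile
        "b90": np.mean(h >= 0.9 * hmax),   # fraction of points near the top
        "cov": h.std() / h.mean(),         # coefficient of variation
    }
```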
3.4. Classification

The classification was performed using various feature configurations: all features, all features without the normalizing spectral features, and the mean spectra only. The evaluation of the different classification methods for the feature-selected datasets and for the mean spectra only is summarized in Table 11.

Table 11. Classification accuracies and Kappa values of the tested classification methods using feature-selected datasets and mean spectra only. Abbreviations: k-NN, k-Nearest Neighbors; MLP, Multilayer Perceptron.

Algorithm | Feature Selected, All Features: Acc. (%) / Kappa | Feature Selected, without Normalizing Spectral Features: Acc. (%) / Kappa | Mean Spectra Only: Acc. (%) / Kappa
C4.5 | 91.4 / 0.84 | 90.7 / 0.83 | 89.5 / 0.80
RandomForest | 94.9 / 0.90 | 92.6 / 0.86 | 93.3 / 0.87
k-NN (k = 1) | 93.1 / 0.87 | 89.5 / 0.80 | 92.3 / 0.86
k-NN (k = 3) | 93.8 / 0.88 | 90.8 / 0.82 | 93.5 / 0.88
k-NN (k = 7) | 93.3 / 0.87 | 91.2 / 0.83 | 93.0 / 0.87
MLP | 95.2 / 0.91 | 92.4 / 0.86 | 95.4 / 0.91
Naive Bayes | 87.1 / 0.77 | 83.3 / 0.70 | 70.7 / 0.49

The best performing classification algorithms were RandomForest and MLP, which provided 94.9% and 95.2% overall accuracies, respectively, for the feature-selected dataset with all features. None of the classification algorithms performed exceptionally poorly, but the Naive Bayes method did not provide good results (87.1%); it was probably the method most affected by the imbalanced data. The classification results using the feature-selected set without the normalizing spectral features were slightly worse than those achieved with the normalizing features; in this case, RandomForest provided the best result, with 93% overall accuracy.

The classification was also tested using only the mean spectra; the results indicate that good classification results can be achieved with the mean spectra alone. The accuracy of the MLP method using only the mean spectra was 95.4% (Table 11). The accuracies of the other classification methods differed by approximately 1% from those achieved with the full feature set with feature selection, except for the Naive Bayes method, which produced only 70.7% accuracy.
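A compact version of this comparison, using scikit-learn rather than the toolchain used in the study, could look like the sketch below; the synthetic data stand in for the real feature matrix and species labels.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_predict, LeaveOneOut
from sklearn.metrics import accuracy_score, cohen_kappa_score

# Synthetic stand-in for the real (n_trees, n_features) data and 4 species labels.
X, y = make_classification(n_samples=120, n_features=20, n_classes=4,
                           n_informative=8, random_state=0)

classifiers = {
    "RandomForest": RandomForestClassifier(n_estimators=100, random_state=0),
    "k-NN (k=3)": KNeighborsClassifier(n_neighbors=3),
    "MLP": MLPClassifier(max_iter=2000, random_state=0),
}

for name, clf in classifiers.items():
    # Leave-one-out cross-validation, as in the study's evaluation.
    pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
    print(f"{name}: accuracy={accuracy_score(y, pred):.3f}, "
          f"kappa={cohen_kappa_score(y, pred):.3f}")
```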
The evaluation of the classification performance with different feature sets using the k-NN and RandomForest classifiers is presented in Tables 12 and 13. The best performance was achieved when all 347 features were used: the accuracies were 95.5% for k-NN and 95.1% for RandomForest. When all features were used, the feature selection did not improve the classification accuracy, but neither did it drastically worsen the results; however, feature reduction based on the feature selection significantly shortened the learning time of the classification, especially with the more complex methods such as MLP. Good performance was also achieved using only the spectral features (94.7% and 94.9% for RandomForest and k-NN, respectively). RandomForest and k-NN (k = 3) achieved 94.8% and 94.9% accuracies, respectively, when all features of the set without normalizing spectral features were used, which was less than 1% worse than with the full feature set. The worst results were obtained when only the structural features were used in the classification. The Kappa values are slightly smaller than the accuracies, which indicates a small effect of the imbalanced data.

Table 12. Effect of different feature sets on classification using the k-NN method with k = 3.

Feature Set | Accuracy (%) | Kappa
All features | 95.5 | 0.92
Spectral features | 94.9 | 0.90
Feature selection | 93.8 | 0.88
All features (no norm. spectra) | 94.9 | 0.90
Spectral features (no norm. spectra) | 94.7 | 0.90
Feature selection (no norm. spectra) | 90.8 | 0.82
Structural features | 68.4 | 0.37

Table 13. Effect of different feature sets on classification using RandomForest.

Feature Set | Accuracy (%) | Kappa
All features | 95.1 | 0.91
Spectral features | 94.7 | 0.90
Feature selection | 94.9 | 0.90
All features (no norm. spectra) | 94.8 | 0.90
Spectral features (no norm. spectra) | 94.3 | 0.89
Feature selection (no norm. spectra) | 92.6 | 0.86
Structural features | 72.0 | 0.39

The species-specific results for the k-NN, RandomForest, and MLP methods with a feature-selected full feature set are presented as confusion matrices in Tables 14–17. As expected, most of the classification errors occurred between pines and spruces. The most frequent error was spruces classified as pines, which might be a result of the imbalance in the data (most of the training data are pines). Birches could be classified with high recall and precision (both over 95%), since they have a strong spectral difference compared to the coniferous species, especially in the NIR channels (Figure 8). When using the normalized features, the best balance between recall and precision was obtained with RandomForest and MLP, which both had a mean F-score of 0.93. The confusion matrix for RandomForest using the feature-selected set without normalizing spectral features (Table 17) indicates that, in this case, most errors resulted from the poor true positive rate (recall) for larch trees.

Table 14. Confusion matrix of classification using the k-NN method (k = 3) with a feature-selected data set. Mean F-score over all species: 0.91. Rows: true class; columns: classified as.

True Class | Pine | Spruce | Birch | Larch | Recall
Pine | 2583 | 43 | 0 | 1 | 0.983
Spruce | 152 | 660 | 3 | 7 | 0.803
Birch | 13 | 9 | 553 | 5 | 0.953
Larch | 17 | 4 | 3 | 98 | 0.803
Precision | 0.934 | 0.922 | 0.989 | 0.883 |

Table 15. Confusion matrix of classification using RandomForest with a feature-selected data set selected from all features. Mean F-score over all species: 0.93. Rows: true class; columns: classified as.

True Class | Pine | Spruce | Birch | Larch | Recall
Pine | 2584 | 41 | 0 | 2 | 0.984
Spruce | 122 | 692 | 2 | 6 | 0.842
Birch | 13 | 8 | 558 | 1 | 0.962
Larch | 11 | 1 | 5 | 105 | 0.861
Precision | 0.947 | 0.933 | 0.988 | 0.921 |

Table 16. Confusion matrix of classification using MLP with a feature-selected data set selected from all features. Mean F-score over all species: 0.93. Rows: true class; columns: classified as.

True Class | Pine | Spruce | Birch | Larch | Recall
Pine | 2564 | 57 | 1 | 5 | 0.976
Spruce | 89 | 718 | 5 | 10 | 0.873
Birch | 11 | 5 | 562 | 2 | 0.969
Larch | 6 | 5 | 5 | 106 | 0.869
Precision | 0.960 | 0.915 | 0.981 | 0.862 |

Table 17. Confusion matrix of classification using RandomForest with a feature-selected data set without normalized spectral features. Mean F-score over all species: 0.85. Rows: true class; columns: classified as.

True Class | Pine | Spruce | Birch | Larch | Recall
Pine | 2555 | 67 | 0 | 5 | 0.973
Spruce | 130 | 680 | 9 | 3 | 0.827
Birch | 8 | 21 | 535 | 16 | 0.922
Larch | 17 | 12 | 20 | 73 | 0.598
Precision | 0.943 | 0.872 | 0.949 | 0.753 |
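The per-species figures reported in Tables 14–17 can be reproduced from raw predictions as in the following sketch; the toy labels are placeholders for the real LOOCV results.

```python
from sklearn.metrics import confusion_matrix, precision_recall_fscore_support

species = ["pine", "spruce", "birch", "larch"]
# Toy stand-ins for the reference labels and the cross-validated predictions.
y_true = ["pine", "pine", "spruce", "spruce", "birch", "larch"]
y_pred = ["pine", "spruce", "spruce", "pine", "birch", "larch"]

cm = confusion_matrix(y_true, y_pred, labels=species)   # rows: true class
print(cm)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=species, zero_division=0)
for s, p, r in zip(species, prec, rec):
    print(f"{s}: precision={p:.3f}, recall={r:.3f}")
print(f"mean F-score: {f1.mean():.2f}")                 # macro average
```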
3.5. Individual Tree Detection and Species Prediction

The individual tree detection was evaluated visually and by using the reference data. A visual examination of the located trees and the orthomosaics indicated that most of the trees in the test sites could be found. The most notable discrepancies between the positions of the detected trees and the trees visually observed in the orthomosaics occurred in areas where the point clouds were incomplete due to matching failures or occlusions (for example, due to strong shadows or insufficient overlaps), and thus had local inaccuracies in the 3D point cloud and the CHM. The most notable undetected trees were shorter trees at the lower levels of the canopy, which could be seen in the orthomosaics but could not be detected from the point clouds, due to the limited ability of passive sensors to penetrate the canopy. Some of the errors might have been induced by the fact that the reference trees were manually placed with the help of the orthomosaics: the tree top peak in the point clouds might have been at a different location than the tree center visually identified from the orthomosaics.

The tree detection in the test sites was tested using the reference data with three different search radii (Table 18; a sketch of the matching procedure is given below). Increasing the search radius had a significant impact on the detection rate in some of the test sites, which indicates that there was a notable difference between the tree crown centers determined from the image mosaics and those detected automatically from the CHM. The best results, with 64%–96% tree identification rates, were obtained with the 2 m search radius. However, due to the lack of reference data for testing, the accuracy estimates of the tree detection are only indicative.

Table 18. Percentage of reference trees found using the individual tree detection method with different search radii, and the total number of trees detected in each test site.

Test Site | 1 m | 1.5 m | 2 m | Total No. of Trees Detected
v01 | 0.82 | 0.88 | 0.89 | 9723
v02 | 0.88 | 0.95 | 0.96 | 5223
v05 | 0.63 | 0.78 | 0.83 | 5942
v06 | 0.80 | 0.82 | 0.82 | 1585
v07 | 0.64 | 0.82 | 0.88 | 2023
v08 | 0.26 | 0.46 | 0.64 | 1884
v09 | 0.63 | 0.72 | 0.77 | 3191
v10 | 0.61 | 0.79 | 0.87 | 1242
v11 | 0.53 | 0.71 | 0.76 | 1611

Visual inspection of the classification of the individual trees indicated that most of the trees were classified correctly. Most errors occurred between pines and spruces and in areas with heavy shadowing (for example, area v08, which was measured under completely sunny skies). The species- and area-specific prediction probabilities presented in Table 19 indicate that the trained MLP classifier could classify most of the detected trees with a high probability (90%–97%). The predictions were most confident for pines (97%), and consequently also in test sites with many pines. The prediction probability for larch (90%) was the lowest, which also gives the v11 test site the lowest mean probabilities, since it contains many larch trees. Figure 11 presents an example of test site v05 with detected and classified trees.
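A minimal sketch of the search-radius matching used for Table 18, assuming xy coordinate arrays for the reference and detected trees (the toy coordinates are placeholders):

```python
import numpy as np
from scipy.spatial import cKDTree

def detection_rate(reference_xy, detected_xy, radius):
    """Fraction of reference trees with a detected tree within radius (m)."""
    dist, _ = cKDTree(detected_xy).query(reference_xy)
    return float(np.mean(dist <= radius))

# Toy coordinates; in the study these come from the field reference and
# the CHM-based detection (Section 2.4.6).
ref = np.array([[0.0, 0.0], [10.0, 5.0], [20.0, 8.0]])
det = np.array([[0.6, 0.3], [10.9, 6.2], [35.0, 1.0]])
for r in (1.0, 1.5, 2.0):
    print(f"{r} m: {detection_rate(ref, det, r):.2f}")
```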
Table 19. Species- and area-specific prediction probabilities using the MLP model trained with a feature-selected data set selected from all features. N is the number of trees classified as a given species in a given area; Mean is the mean prediction probability; Std. is the standard deviation.

Species | Metric | v01 | v02 | v05 | v06 | v07 | v08 | v09 | v10 | v11 | All Test Sites
Pine | N | 6102 | 593 | 3580 | 246 | 137 | 382 | 1029 | 579 | 415 | 13063
Pine | Mean | 0.98 | 0.95 | 0.97 | 0.95 | 0.94 | 0.94 | 0.96 | 0.96 | 0.95 | 0.97
Pine | Std. | 0.07 | 0.10 | 0.08 | 0.11 | 0.12 | 0.13 | 0.11 | 0.11 | 0.12 | 0.09
Spruce | N | 2634 | 2376 | 1886 | 876 | 1623 | 1105 | 1789 | 308 | 780 | 13377
Spruce | Mean | 0.96 | 0.97 | 0.93 | 0.97 | 0.98 | 0.96 | 0.97 | 0.94 | 0.95 | 0.96
Spruce | Std. | 0.11 | 0.08 | 0.12 | 0.10 | 0.06 | 0.09 | 0.09 | 0.11 | 0.11 | 0.10
Birch | N | 951 | 2237 | 421 | 451 | 255 | 246 | 365 | 325 | 154 | 5405
Birch | Mean | 0.94 | 0.98 | 0.93 | 0.95 | 0.91 | 0.94 | 0.92 | 0.93 | 0.87 | 0.95
Birch | Std. | 0.14 | 0.08 | 0.14 | 0.13 | 0.15 | 0.12 | 0.15 | 0.15 | 0.14 | 0.12
Larch | N | 36 | 17 | 55 | 12 | 8 | 151 | 8 | 30 | 262 | 579
Larch | Mean | 0.82 | 0.84 | 0.87 | 0.84 | 0.81 | 0.91 | 0.86 | 0.89 | 0.92 | 0.90
Larch | Std. | 0.17 | 0.17 | 0.16 | 0.20 | 0.21 | 0.16 | 0.22 | 0.17 | 0.15 | 0.16
All species | N | 9723 | 5223 | 5942 | 1585 | 2023 | 1884 | 3191 | 1242 | 1611 | 32424
All species | Mean | 0.97 | 0.97 | 0.95 | 0.96 | 0.97 | 0.95 | 0.96 | 0.95 | 0.93 | 0.96
All species | Std. | 0.09 | 0.08 | 0.11 | 0.11 | 0.09 | 0.11 | 0.10 | 0.13 | 0.13 | 0.10

Figure 11. Example of detected and classified trees in the v05 test site. Blue stars are birches, yellow triangles are spruces, green circles are pines, and red squares are larches.
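The per-species probability statistics of Table 19 can be derived from a classifier's class-membership probabilities; the following sketch uses scikit-learn conventions, with a toy probability array standing in for the real model output.

```python
import numpy as np

species = np.array(["pine", "spruce", "birch", "larch"])
# Toy stand-in for clf.predict_proba(features_of_detected_trees):
proba = np.array([[0.90, 0.06, 0.03, 0.01],
                  [0.15, 0.80, 0.03, 0.02],
                  [0.85, 0.10, 0.04, 0.01]])
pred = proba.argmax(axis=1)                     # predicted species index
top = proba[np.arange(len(pred)), pred]         # probability of the prediction
for i, name in enumerate(species):
    sel = pred == i
    if sel.any():
        print(f"{name}: N={sel.sum()} mean={top[sel].mean():.2f} "
              f"std={top[sel].std():.2f}")
```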
3.6. Processing Times

In the following, we present estimates of the computing times needed to process the datasets. The given estimates are computing times for the research system on non-optimized computers. In addition, the process included some interactive steps, such as measuring the GCPs and reflectance panels in the images, as well as interactive quality control; these steps are expected to have a minimal impact on the computation times once automated in the future.

Most of the processing time was spent on computing the image orientations and calculating the photogrammetric point clouds, which took a minimum of 3.5 h and a maximum of 11.5 h, for test sites v02 and v0304, respectively. The band-wise exterior orientation calculation took approximately 5 s per band and image, resulting in processing times of 1.5 h and 6 h for the smallest and largest blocks, respectively. The radiometric block adjustment and mosaic calculation took approximately 3–6 min per hypercube, depending on the parameters and the size of the area, resulting in 2 h and 13 h for the smallest and largest blocks, respectively. These numbers are affected by many factors, such as the quality of the approximate values, the selected quality levels, and the complexity of the area.

The spectral and structural feature extraction took approximately 2.3 h and 10 h for all test sites, respectively. The feature selection methods took approximately 5 min. The classification model learning and evaluation times varied greatly: the fastest classification method using the feature-selected dataset was k-NN, which could be trained and evaluated with LOOCV in less than one minute, whereas the RandomForest and MLP methods took 3.6 h and 12.3 h, respectively. The complete individual tree detection workflow (i.e., from point cloud to tree locations) took approximately 40 min for all test sites, and the tree species prediction of the detected trees using the trained MLP model took approximately 5 s for all trees.

4. Discussion

This investigation developed a small-UAV hyperspectral imaging and photogrammetry-based remote sensing method for individual tree detection and classification in a boreal forest. The method was assessed using altogether eleven forest test sites with 4151 reference trees.

4.1. Aspects of Data Processing

The datasets were captured in a deep forest scene, and the conditions during the data capture were typical for the northern climate zone in summer, with variable cloudiness and rain showers. Variable conditions are challenging for analysis methods based on passive spectral information, and the conventional radiometric correction methods based on reflectance panels or radiative transfer modeling [42–46] are not suitable in such conditions. The novel radiometric processing approach based on radiometric block adjustment and onboard irradiance measurements provided promising results. The reflectance panels could not be utilized in several test sites in this study because they were too deep in the forest and shadowed by the trees. In many operational applications, the installation of reflectance panels is likely to be challenging; thus, it is important to develop methods that are independent of them. Important future improvements will include making the irradiance spectrometer insensitive to platform tilting, as well as improving the system calibration. The UAV system used in this study was equipped with an irradiance spectrometer, but the data quality was not sufficient for performing an accurate enough reflectance correction. Future enhancements are expected to improve the accuracy and the level of automation, ultimately removing the need for ground reflectance panels. Approaches using object characteristics could also be suitable; for example, one possible method would be to use reflectance information of known natural targets in the scenes (such as trees) to perform radiometric normalization between areas whose data were collected in different flights. An important topic for future studies will be how different illumination conditions (direct sun illumination and different shadow depths) should be accounted for in the data analysis in a forested environment, and which parts of the variation should be compensated for. For example, Korpela et al. [15] classified spectral tree features based on the illumination condition derived from a 3D object model.

The processing and analysis were highly automated. The major interactive processes were the deployment of the geometric and radiometric in situ targets and the field reference data capture. We expect that, with future improvements in direct georeferencing performance and system radiometric calibration, the data processing chain could be fully automated. Further work is needed to automate the collection of tree species reference data; for example, spectral libraries [68] could provide a good reference if the data were properly calibrated. With this data set, we also used interactive analysis in some phases of the classification, but these steps could likewise be automated.
Many different software packages, including commercial, open-source, and in-house tools, were chained together, and much of the software was not optimized for the material used in this study or with respect to speed. The computing times of the photogrammetric and radiometric processing were approximately 8 h to 32 h, depending on the size of the block; the combined feature extraction, selection, and classification took approximately 25 h with the MLP method, which was the slowest method. All the operations can be highly automated, optimized, and parallelized once sufficient information about the measurement problem is available; thus, faster processing can be implemented in the future. In future studies, we will also investigate more challenging forests with a larger variability of tree species.

4.2. Consideration of Identification and Classification Results

Spectral and 3D point cloud features were used in the classification. The feature selection results indicated that the spectral features were more significant in discriminating the tree species than the structural features derived from the photogrammetric point clouds. Based on the feature selection and classification results, the normalizing spectral features proved to be slightly better than the non-normalizing spectral features. The best classification results were obtained with RandomForest and MLP, which both gave 95% overall accuracies and F-scores of 0.93.

No previous studies were found in which both spectral and structural data measured from a UAV had been used to classify individual trees; previous UAV-based studies have relied on multispectral data captured using NIR-modified cameras only. Lisein et al. [41] performed classification using RandomForest and achieved an overall classification error of 16% (out-of-bag error) with five different tree species. Michez et al. [39] achieved a classification accuracy of 84% with five different tree species, but the results were closer to a pixel-based classification in which a single tree was not represented as a whole. In studies not based on UAV data, Dalponte et al. [69] used high resolution airborne hyperspectral imagery to classify individual trees with 93.5% overall accuracy for three classes, namely pine, spruce, and broadleaves. In addition, Lee et al. [70] tested combined ALS data and airborne hyperspectral imagery in tree species classification (six, mainly deciduous, species) in Northern European conditions, achieving users' accuracies between 36.4% and 100% (average 61.1%) at the individual tree level. Vaughn et al. [71] used ALS data to separate five tree species (two conifers and three deciduous) in the Pacific Northwest, U.S.A., achieving users' accuracies between 81.5% and 95% (average 85.4%). An accuracy of 88%–90% was achieved by Korpela et al. [72] in classifying Scots pine, Norway spruce, and birch using airborne LiDAR data.

The success rates in identifying individual trees from the photogrammetric surface models with an approximately 5 cm point interval were 26%–88%, 46%–95%, and 64%–96% for the search radii of 1 m, 1.5 m, and 2 m, respectively. Here, the geometric consistency of the ground truth and the DSM used in the assessment of the results plays an important role. The results are good compared to the previous studies by Kaartinen et al. [73] and Wang et al.
[74], where similar individual tree detection rates were obtained for dominant trees (i.e., the tallest trees in their neighborhood) using ALS data. Individual tree detection from photogrammetric point clouds created from UAV data has been tested, for example, by Sperlich et al. [75], who achieved 90% detection rates with a 2 m search radius; however, that experiment concentrated on a single forest plot, and, to the best of the authors' knowledge, more extensive studies have not been conducted. The local maxima-based method was successful in the boreal forest, where the trees have sharply shaped crowns, but further tuning is expected to be needed to obtain good results in forests with less separable crowns; here, the spectral features might provide additional information.

Tree density has a strong impact on the detection accuracy of individual trees using point cloud data [73,74]. However, the tree density estimates for our study area are based on the tree densities of the individual plots within each test site, calculated from the original reference data without the manual reference data modification. The number of reference plots per test site varied, and the tree densities of the plots within the same test site also varied significantly. Thus, the plot-level tree densities did not represent the entire test sites properly, and this information could not be used reliably to evaluate the effect of tree density on the tree detection rates in the test sites.

The various classifiers used for tree species classification include, e.g., discriminant analysis, maximum likelihood, spectral angle mapper, support vector machines, radial basis functions, neural networks, and nearest neighbor classifiers [76]. Korpela et al. [72] tested discriminant analysis, k-nearest neighbor, and RandomForest estimators in tree species classification using airborne LiDAR data; the RandomForest method generally gave the best classification results. In this study, the Multilayer Perceptron (MLP) and RandomForest provided the best classification results. MLP has not been used in tree species classification as widely as RandomForest. A study by Feret and Asner [77