Bauyrzhan Aubakir, Birzhan Nurimbetov, Iliyas Tursynbek, Huseyin Atakan Varol, Vital Sign Monitoring Utilizing Eulerian Video Magnification and Thermography, EMBC, 2016.
You can download the thermal video database used in this work from the ARMS Repository:
If you want to use that database, please cite our paper.
In this work, we present a proof of concept for vital sign measurement using thermal and RGB videos obtained from a smartphone. The hardware and software block diagram of our framework is shown in Figure 1. Specifically, spectrogram analysis of the EVM-amplified RGB video from a high-frame-rate smartphone camera is used for heart rate estimation. Cycles of temperature change in the raw thermal image around the nasal region are counted for respiration rate estimation. Body temperature is measured by averaging the temperature of the forehead region pixels of the thermal image. As a novelty, we leverage face detection and region of interest (ROI) segmentation to measure the vitals without restricting the natural motion of the patient and to increase the robustness of the estimation.
Figure 1. Hardware architecture of the RGB-Thermal data acquisition setup and the block diagram of the vital sign measurement framework.
A. Experimental Setup
For monitoring heart rate, we used a phone camera with 1280x720 resolution and a 240 Hz frame rate.
For monitoring respiratory rate and body temperature, we used a FLIR Lepton long-wave infrared camera with 80x60 resolution, an 8.6 Hz frame rate, and 50 mK thermal sensitivity. The thermal camera is connected to a BeagleBone Black single-board computer.
Both the phone and the thermal camera with the BeagleBone Black are housed in a 3D-printed enclosure and mounted on a tripod to maintain a stable pose (see Figure 2). The subjects were positioned approximately 30-40 cm from the data acquisition setup in full-face view (see Figure 3). RGB and thermal videos were captured simultaneously in an environment with daylight and no flickering light sources.
Figure 2. Hardware components of the experimental setup.
Figure 3. Relative positioning of the data acquisition setup with respect to the experimental subject.
B. Face Detection and ROI Segmentation
We implemented face detection and tracking in order to observe the vital signs within the ROIs of the recorded RGB and thermal videos (see Figure 4). For heart rate and temperature estimation, we focus on the forehead region. For respiration rate estimation, we use the nasal region as the ROI. In that region of the thermal video, the temperature change between inhalation and exhalation can be observed clearly.
These parts of the face were detected using the MATLAB implementation of the Viola-Jones object detection framework and tracked using the KLT feature tracker.
A face detection classifier for RGB images is available in MATLAB. It was used to detect the face in the RGB video for heart rate estimation. However, this classifier does not work properly on thermal images, since it was trained on visible-band RGB data. We therefore created a database of 47 thermal videos of different persons to train our own classifier. Training the face detection algorithm requires positive (face present) and negative (face not present) thermal images. These images, annotated with the specific regions of interest, were used to train our classifier for thermal face detection. Two thirds of the data were used for classifier training and one third was set aside for testing. The trained classifier achieved 97.8 percent accuracy. All faces were detected correctly, with only a few instances of false positives. Being able to detect faces in thermal video allowed us to extract the nasal and forehead regions for respiration rate and temperature estimation, respectively.
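As a rough illustration of how ROIs might be cut out of a detected face and averaged, here is a minimal Python sketch. It is not the MATLAB pipeline used in this work: the face box is simply given, the frame is synthetic, and the fractional offsets in FOREHEAD and NASAL are hypothetical values chosen for the example.

```python
from statistics import mean

# Hypothetical ROI placement as fractions of the face box:
# ((x_start, x_end), (y_start, y_end)). These values are assumptions.
FOREHEAD = ((0.25, 0.75), (0.05, 0.25))
NASAL = ((0.35, 0.65), (0.55, 0.75))

def sub_roi(face_box, x_frac, y_frac):
    """Return (x, y, w, h) of a sub-rectangle of the face box."""
    x, y, w, h = face_box
    x0 = x + int(round(x_frac[0] * w))
    x1 = x + int(round(x_frac[1] * w))
    y0 = y + int(round(y_frac[0] * h))
    y1 = y + int(round(y_frac[1] * h))
    return x0, y0, x1 - x0, y1 - y0

def roi_mean(frame, roi):
    """Average the pixel values of a frame (list of rows) inside an ROI."""
    x, y, w, h = roi
    return mean(frame[r][c] for r in range(y, y + h) for c in range(x, x + w))

# Synthetic 80x60 thermal frame (60 rows, 80 columns), values in degrees Celsius.
frame = [[30.0 for _ in range(80)] for _ in range(60)]
face_box = (20, 10, 40, 40)  # assumed detector output: x, y, width, height

forehead = sub_roi(face_box, *FOREHEAD)
nasal = sub_roi(face_box, *NASAL)

# Make the forehead warmer so the two ROI means differ.
fx, fy, fw, fh = forehead
for r in range(fy, fy + fh):
    for c in range(fx, fx + fw):
        frame[r][c] = 34.0

forehead_temp = roi_mean(frame, forehead)
nasal_temp = roi_mean(frame, nasal)
print(forehead, forehead_temp, nasal, nasal_temp)
```

In a real recording the face box would come from the detector and tracker in every frame, so the ROIs follow the subject's motion.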
Figure 4. Results of the face detection and the ROI extraction procedure: RGB image with marked face and forehead ROIs (a) and thermal image with marked nasal ROI (b).
C. EVM Based Heart Rate Monitoring Using RGB Video
After performing face detection and ROI segmentation in the RGB video stream, we apply the EVM procedure to the forehead of the subject. Assuming that the heart rate lies between 0.6 and 3 Hz (36 and 180 beats per minute (bpm)), we amplified the video by a factor of 20 within this frequency range. After this procedure, the diastole (relaxation) and systole (contraction) of the heart can be observed (see Figure 5a and 5b). The amplified video was converted to the hue-saturation-value (HSV) color model. An FFT was performed on the hue-channel time series of each pixel (7200 samples for a 30 second video at 240 Hz). The frequency component with the highest magnitude was recorded for each pixel's spectrum, and these values were combined into a histogram, where the highest bar indicates the raw heart rate estimate. To obtain a more precise result, we fit a probability distribution using kernel density estimation and select its maximum as the heart rate. The histogram and the fitted kernel distribution are shown in Figure 5c.
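The spectral part of this procedure (per-pixel dominant frequency, histogram, kernel density peak) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the pixel signals are synthetic, the frame rate and clip length are reduced so that a naive DFT stays fast, and the Gaussian kernel bandwidth of 0.05 Hz is an arbitrary choice. The EVM amplification step itself is not shown.

```python
import cmath
import math
import random
from collections import Counter

FS = 30.0              # assumed frame rate for this sketch (the recordings use 240 Hz)
DURATION_S = 10        # assumed clip length (the text uses 30 s clips)
F_LO, F_HI = 0.6, 3.0  # heart rate band from the text: 36 to 180 bpm

def dominant_frequency(signal, fs, f_lo, f_hi):
    """Frequency (Hz) of the largest DFT magnitude inside [f_lo, f_hi]."""
    n = len(signal)
    avg = sum(signal) / n
    best_f, best_mag = None, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f < f_lo or f > f_hi:
            continue
        coeff = sum((x - avg) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_f, best_mag = f, abs(coeff)
    return best_f

def kde_peak(samples, bandwidth, grid):
    """Grid point that maximizes a Gaussian kernel density estimate."""
    def density(g):
        return sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in samples)
    return max(grid, key=density)

# Synthetic stand-in for the hue time series of 20 forehead pixels:
# 16 pixels carry a 1.2 Hz pulse plus noise, 4 pixels carry noise only.
rng = random.Random(0)
n = int(FS * DURATION_S)
true_hz = 1.2
pixels = []
for i in range(20):
    amp = 1.0 if i < 16 else 0.0
    pixels.append([amp * math.sin(2 * math.pi * true_hz * t / FS) + rng.gauss(0, 0.5)
                   for t in range(n)])

# One dominant frequency per pixel, then histogram mode and KDE maximum.
peaks = [dominant_frequency(p, FS, F_LO, F_HI) for p in pixels]
raw_hz = Counter(round(f, 3) for f in peaks).most_common(1)[0][0]
grid = [F_LO + 0.01 * i for i in range(241)]  # 0.6 to 3.0 Hz in 0.01 Hz steps
refined_hz = kde_peak(peaks, bandwidth=0.05, grid=grid)

print("raw estimate (bpm):", raw_hz * 60)
print("KDE estimate (bpm):", refined_hz * 60)
```

The histogram mode is limited to the DFT bin spacing (here 0.1 Hz, or 6 bpm), whereas the density maximum is evaluated on a finer grid, which is one way to read the precision gain described above.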
Figure 5. EVM amplified RGB images during the diastole (a) and systole (b) parts of the heart cycle. Histogram of the maximum frequency components of each forehead region pixel from the RGB video (c).
D. Respiratory Rate Monitoring Using Thermal Video
After performing face detection on the thermal video, the software extracts the respiratory rate. From the detected face, we extract the nasal area and measure its average temperature in each frame. Depending on the anatomy of the nasal area and the temperature of the environment, the philtrum temperature change can range from 0.25 K to 2 K. After computing the per-frame means, we take the FFT of the resulting time series and report the strongest frequency as the respiration rate. Thermal images of the nasal region during inhalation and exhalation are shown in Figure 6a-b. The average temperature of the nasal region for a 30 second trial is shown in Figure 6c.
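The frequency-domain step can be illustrated with a short Python sketch. It is a minimal example under stated assumptions rather than the code used in this work: the nasal temperature trace is synthetic (a 0.5 K oscillation at 0.3 Hz plus noise), and the search band of 0.1 to 0.7 Hz (6 to 42 breaths per minute) is an assumed plausible range that the text does not specify.

```python
import cmath
import math
import random

FS = 8.6               # thermal camera frame rate from the experimental setup (Hz)
DURATION_S = 30        # trial length from the text (s)
F_LO, F_HI = 0.1, 0.7  # assumed respiration band: 6 to 42 breaths per minute

def strongest_frequency(signal, fs, f_lo, f_hi):
    """Frequency (Hz) of the largest DFT magnitude inside [f_lo, f_hi]."""
    n = len(signal)
    avg = sum(signal) / n  # remove the mean so the DC term does not dominate
    best_f, best_mag = None, -1.0
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f < f_lo or f > f_hi:
            continue
        coeff = sum((x - avg) * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        if abs(coeff) > best_mag:
            best_f, best_mag = f, abs(coeff)
    return best_f

# Synthetic mean nasal temperature (K): one value per thermal frame.
rng = random.Random(0)
n = round(FS * DURATION_S)  # 258 frames
true_hz = 0.3               # 18 breaths per minute
temps = [307.0 + 0.5 * math.sin(2 * math.pi * true_hz * t / FS) + rng.gauss(0, 0.05)
         for t in range(n)]

breaths_per_min = 60 * strongest_frequency(temps, FS, F_LO, F_HI)
print("respiration rate (breaths/min):", breaths_per_min)
```

With 258 frames at 8.6 Hz, the frequency resolution is 1/30 Hz, so the estimate is quantized to steps of 2 breaths per minute for a 30 second trial.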
Figure 6. Thermal image of the nasal area obtained during inhalation (a), and exhalation (b). Positive peaks of the philtrum temperature change are used for respiratory rate estimation (c).
Using the vital sign monitoring framework, we acquired the heart rate of 11 subjects and calculated the accuracy by comparing it to the values obtained from a Sigma Sport PC 10.11 heart rate monitor. The results are given in Table I.
Table I. Vital signs obtained from the RGB and thermal video processing