Thursday, August 27, 2020

Calculating Spatial Data Quality

This fall I begin Special Topics in GIS as I enter the last semester of my graduate certificate in GIS.  This week's focus was on methods of calculating spatial data quality.

The first task was to calculate the horizontal and vertical accuracy and precision of data acquired from a handheld GPS device.  Accuracy is how close values are to an accepted reference value, while precision is how close values are to one another (for example, a cluster of GPS points from the same device measuring the same spot would be precise).

We started with a collection of GPS points that all recorded the same location.  To find the average of the points, I took the mean latitude and longitude of all the collected GPS points and created a new averaged point in its own feature class.
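A minimal sketch of that averaging step in plain Python, assuming the waypoints are available as latitude/longitude pairs (the coordinates below are made up for illustration):

```python
# Hypothetical GPS waypoints recorded at the same location (lat, lon).
waypoints = [
    (30.46781, -87.23415),
    (30.46779, -87.23410),
    (30.46784, -87.23418),
]

# Simple arithmetic mean of the coordinates; reasonable when the points
# are clustered tightly enough that projection effects can be ignored.
mean_lat = sum(lat for lat, lon in waypoints) / len(waypoints)
mean_lon = sum(lon for lat, lon in waypoints) / len(waypoints)
print(f"Average point: ({mean_lat:.5f}, {mean_lon:.5f})")
```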

Next, I made a multi-ring buffer around this average point.  I calculated the buffer distances by finding the distances within which 50%, 68%, and 95% of the points fall.  To find the index for each percentile, I took the waypoints feature class that had been spatially joined with the average point (giving each waypoint a distance field) and multiplied the total number of waypoints by the desired percentile; the distance at that index is the radius within which that percentage of points falls inside the buffer.  Creating the multi-ring buffer this way shows how precise the data collection process was.
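A sketch of that percentile-index calculation in Python, assuming the joined distance values (in meters) have already been extracted; the distances below are illustrative only:

```python
# Hypothetical distances (meters) from each waypoint to the averaged point.
distances_m = [1.2, 2.0, 2.8, 3.1, 3.6, 3.9, 4.4, 5.6, 6.2, 9.8]

def percentile_distance(distances, pct):
    """Return the distance within which `pct` percent of the points fall."""
    ordered = sorted(distances)
    index = int(round(pct / 100 * len(ordered))) - 1  # index of the percentile point
    return ordered[max(index, 0)]

# Buffer radii for the 50%, 68%, and 95% rings.
for pct in (50, 68, 95):
    print(f"{pct}% of points fall within {percentile_distance(distances_m, pct)} m")
```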


68% of the points in this data collection fell within 4.4 meters of the location of the averaged waypoint.

Another important aspect of data quality is accuracy.  We measured this using an absolute reference point that was established outside of the data collection process.  The majority of this work was done in Microsoft Excel using .dbf files: I compared the waypoint data against the benchmark data to calculate the values used in the cumulative distribution function (CDF) graph.
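The lab used Excel, but the same cumulative distribution values could be computed with a short Python sketch; the error distances below are placeholders for the waypoint-to-benchmark distances:

```python
# Hypothetical horizontal errors (meters) between each waypoint and the benchmark.
errors_m = [0.6, 1.3, 1.8, 2.4, 2.9, 3.3, 4.1, 4.8, 5.5, 7.2]

# Empirical CDF: fraction of readings at or below each error distance.
ordered = sorted(errors_m)
cdf = [(dist, (i + 1) / len(ordered)) for i, dist in enumerate(ordered)]

for dist, prob in cdf:
    print(f"P(error <= {dist:.1f} m) = {prob:.0%}")
```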

Consulting this graph shows the likelihood that a given reading will fall within a given distance of the reference point.  This particular GPS device only has about a 10% chance of being within a meter of the reference point for any particular reading.


The CDF shows how likely it is for a point measured with the GPS to be within a certain distance of a reference point.  Knowing the accuracy of a GPS device is important, since some projects may suffer from poor data accuracy.


Wednesday, August 5, 2020

Damage Assessment

A substantial part of the work that GIS professionals do at organizations like FEMA is assessing where and how badly areas were damaged after a natural disaster.  Damage assessment is useful for identifying the hardest hit areas and the extent of reconstruction necessary.  This week's lab involved assessing the damage caused by Hurricane Sandy after it made landfall near Atlantic City, New Jersey.  Even though it only struck the area as a category 1 hurricane, the area was not accustomed to hurricanes, and Sandy was also the largest Atlantic hurricane on record by diameter.

To assess damage, the first step was to import both pre-storm and post-storm imagery of the area.  I also added a parcel layer to make it clearer exactly what the boundaries would be for each damage assessment.  Before I could assess the damage, I had to create attribute domains to constrain the data input values so that they could only come from a select set of options and would be less subject to input error.  The domain most useful for this assignment was a structure damage domain with coded values from zero to four - with zero being no damage and four being completely destroyed.  Following that, I created a new feature class set to use the domains I had just created.
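A sketch of how such a coded-value domain could be set up with arcpy; the geodatabase path, domain name, field name, and intermediate code descriptions are placeholders rather than the exact values used in the lab:

```python
import arcpy

# Placeholder geodatabase path.
gdb = r"C:\data\damage_assessment.gdb"

# Create a coded-value domain for structure damage (0-4).
arcpy.management.CreateDomain(gdb, "StructureDamage",
                              "Observed structural damage", "SHORT", "CODED")

# Hypothetical code descriptions; zero = no damage, four = destroyed.
damage_codes = {
    0: "No damage",
    1: "Affected",
    2: "Minor damage",
    3: "Major damage",
    4: "Destroyed",
}
for code, description in damage_codes.items():
    arcpy.management.AddCodedValueToDomain(gdb, "StructureDamage", code, description)

# Assign the domain to the damage field on the assessment point feature class.
arcpy.management.AssignDomainToField(f"{gdb}\\DamagePoints", "DamageLevel", "StructureDamage")
```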

Then came the part where I actually determined the damage.  I found it easiest to first create a point for each of the parcels and then go through each one and set the level of damage.  I think someone who did this work regularly would find it useful to have two high-resolution monitors so they could see as much of the imagery as possible alongside the tables.  Judging how much damage there was from satellite imagery alone was difficult.  It was easy to see when a building was completely destroyed, but harder to discern between "affected" and "minor damage"; in a real-world situation it would be ideal to have someone on the ground assess each parcel up close.

Each of the parcels in the study area was assigned a structure damage value based on the damage discernible in the satellite imagery.
 
The next step was to determine how distance from the coast affected the extent of the damage.  I created a new feature class for the coast and drew it in as a line.  I then used the Multiple Ring Buffer tool to establish three distance ranges.  Next, I used the Clip tool to extract only the parcels in the selected study area.  Then I used a spatial join to attach the structure damage values to the clipped parcels, and a second spatial join to attach the multi-ring buffer to the layer created in the previous step.  This gives each parcel in the study area both a damage value and a distance value.  Lastly, I tallied the counts using the Select By Attributes tool (example query: "WHERE distance is equal to 300 AND structure damage is equal to 2").
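Rather than running a Select By Attributes query for every distance/damage combination, the same counts could be tallied in one pass with a search cursor; a sketch, assuming placeholder field names ("distance", "STRUCT_DAM") and feature class path:

```python
import arcpy
from collections import Counter

# Parcels after both spatial joins (placeholder path and field names).
joined_parcels = r"C:\data\damage_assessment.gdb\parcels_joined"

# Count parcels for every (buffer distance, damage level) combination.
counts = Counter()
with arcpy.da.SearchCursor(joined_parcels, ["distance", "STRUCT_DAM"]) as cursor:
    for distance, damage in cursor:
        counts[(distance, damage)] += 1

for (distance, damage), n in sorted(counts.items()):
    print(f"{n} parcels within {distance} m with damage level {damage}")
```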

The results of the damage assessment for each of the buffer zones.

As is to be expected, areas closer to the coast were generally hit harder by the storm.  This is a fairly small sample size, and I would resist the temptation to extrapolate these results across the entire area.  However, this process could be repeated for other sections, and the results could be used to help form a more complete picture.


Saturday, August 1, 2020

Coastal Flooding

The primary focus of the lab was using LiDAR and USGS DEM data to predict areas that would be impacted by an incoming storm surge.  This week's lab was more of a challenge for me than usual.  I had some trouble getting the tools to run properly, or they had extremely long run times.  Ultimately, I did produce the final products.

Our first task was to use LiDAR data from before and after Hurricane Sandy hit.  This comparison shows where buildings were destroyed or where debris piled up, based on the change in height between the two layers.  Additional analysis could compare current data with the existing building footprints to see where structures have been rebuilt or replaced.
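A sketch of that height-change comparison as a raster difference in arcpy; the file paths are placeholders and the Spatial Analyst extension is assumed to be available:

```python
import arcpy
from arcpy.sa import Raster

arcpy.CheckOutExtension("Spatial")

# Placeholder pre- and post-storm LiDAR surface rasters.
pre_storm = Raster(r"C:\data\lidar\pre_sandy_dsm.tif")
post_storm = Raster(r"C:\data\lidar\post_sandy_dsm.tif")

# Negative values indicate lost height (destroyed structures),
# positive values indicate gained height (piled-up debris).
height_change = post_storm - pre_storm
height_change.save(r"C:\data\lidar\height_change.tif")
```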

The height changes visible in the LiDAR data show that the worst of the destruction was right along the coast.

For the second map, we used data for Collier County, Florida.  We compared the USGS DEM to the LiDAR DEM and then determined which buildings would be impacted by a one-meter storm surge.  The LiDAR data is considered more accurate given the high accuracy of LiDAR itself.  When the two are compared against one another, it becomes even more obvious how challenging it is to accurately predict exactly where a storm will hit hardest.
The USGS DEM differs somewhat from the LiDAR layer.  There are more factors at play than a storm surge zone alone can account for.
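A sketch of how the one-meter surge zone and the impacted buildings could be derived with arcpy; the paths and feature class names are placeholders, and the same steps could be run once with the LiDAR DEM and once with the USGS DEM to compare the two:

```python
import arcpy
from arcpy.sa import Con, Raster

arcpy.CheckOutExtension("Spatial")

# Placeholder DEM; cells at or below 1 m elevation are treated as the surge zone.
dem = Raster(r"C:\data\collier\lidar_dem.tif")
surge_zone = Con(dem <= 1.0, 1)
surge_zone.save(r"C:\data\collier\surge_1m.tif")

# Convert the surge zone to polygons and select buildings that intersect it.
arcpy.conversion.RasterToPolygon(surge_zone, r"C:\data\collier.gdb\surge_1m_poly", "NO_SIMPLIFY")
arcpy.management.MakeFeatureLayer(r"C:\data\collier.gdb\buildings", "buildings_lyr")
arcpy.management.SelectLayerByLocation("buildings_lyr", "INTERSECT",
                                       r"C:\data\collier.gdb\surge_1m_poly")

# Number of potentially impacted buildings.
print(arcpy.management.GetCount("buildings_lyr"))
```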