To facilitate meaningful comparisons of local neighbourhood attributes for addresses and areas across diverse cities around Australia, the 2018 Australian National Liveability Study required identification of datasets with broad coverage and consistent definitions across jurisdictions. As described above, OpenStreetMap was utilised as a nationally consistent open data source for pedestrian routable roads and walking path data, as well as for a range of points of interest. Others have previously demonstrated validity of the use of OpenStreetMap for developed urban areas such as those included in this study38. However, to examine these assumptions, through the course of our study we conducted validation experiments and sensitivity analyses, including investigation of null and outlying values, ground truthing comparisons using satellite data, and systematic comparisons of OpenStreetMap derived features with those from official or commercial datasets, as described below.
Approach to measuring and aggregating residential address point exposures
An early methodological decision in the project was to measure liveability indicators for residential address locations23, rather than approximate this using population weighted centroids as had been done in earlier work56. Address point data allowed for the disaggregated and aggregated data to be used for different purposes (i.e., linkage). For example, linkage of residential address indicator data to geocoded participant locations in health and other surveys allowed study of associations with health outcomes. Further, this also allowed measures recorded for these locations within Mesh Blocks with known residential dwellings to be readily aggregated to larger scales. Mesh Blocks with dwellings capture a two-dimensional spread of the locations where people may live, and small area counts of dwellings or persons from Census data could be used when aggregating to represent the average experience of persons or dwellings with regard to specific phenomena at a range of scales, while retaining the capacity to interrogate variation57. While a population-weighted centroid aims to capture an average location representative of experiences for a broader area, the risk is that this may result in measurement for an average location where nobody lives. This can be seen in Fig. 3, which contrasts population weighted centroids for SA1 areas with address points36 overlaid by population counts7. The population weighted centroid for the SA1 in the upper right corner sits in a location without any population count at the 2016 Census, and reliance on this single point risks misrepresenting the average experience in that neighbourhood. In contrast, measurement using address points in Mesh Blocks with dwellings ensures a degree of robustness when aggregating upwards.
While measurements for a single point may be an outlier in terms of neighbourhood representation, the average of a suite of points will provide a fairer representation of the ‘average experience’ for persons or dwellings, particularly when weighted by person or dwelling counts in the process of aggregation.
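The aggregation logic described above can be illustrated with a minimal sketch. It assumes address-point measures are first averaged within each Mesh Block and then combined using dwelling counts as weights; the function names, identifiers and values are hypothetical and are not taken from the study's code.

```python
from collections import defaultdict


def mesh_block_means(address_records):
    """Average address-point values within each Mesh Block.

    address_records: iterable of (mesh_block_id, value); value may be None.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for mb_id, value in address_records:
        if value is None:  # skip addresses without a measurement
            continue
        sums[mb_id] += value
        counts[mb_id] += 1
    return {mb_id: sums[mb_id] / counts[mb_id] for mb_id in sums}


def weighted_area_mean(mb_means, mb_dwellings):
    """Dwelling-weighted mean of Mesh Block averages for a larger area."""
    weighted_sum = 0.0
    total_weight = 0
    for mb_id, mean_value in mb_means.items():
        weight = mb_dwellings.get(mb_id, 0)
        weighted_sum += weight * mean_value
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None


# Hypothetical data: distance (m) to some amenity for four address points.
addresses = [("MB1", 200.0), ("MB1", 400.0), ("MB2", 1000.0), ("MB2", None)]
dwellings = {"MB1": 30, "MB2": 10}

mb_means = mesh_block_means(addresses)
area_mean = weighted_area_mean(mb_means, dwellings)
```

Because the Mesh Block with more dwellings carries more weight, the area estimate sits closer to the experience of most dwellings than a simple average of the two Mesh Block means would.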
Pedestrian network model
Evaluation of local neighbourhood walkability and access to amenities was underpinned by assumptions of valid street network data. Prior to commencement of the national analysis, in order to evaluate suitability of using OpenStreetMap data for accessibility analyses, we conducted preliminary investigations of these assumptions, which led to a refinement in our approach. We conducted a sensitivity analysis comparing results obtained using a pedestrian network derived from OpenStreetMap (June 2018) using OSMnx with results obtained using an analogous pedestrian network previously derived from the 2013 Public Sector Mapping Agency (PSMA) Transport and Topography Street Network58, which excluded heavy roads and those inaccessible to pedestrians29. The OpenStreetMap-derived network was constructed using a custom pedestrian network filter based on the OSMnx ‘walk’ network type, omitting the exclusion of cycling (Function 1). This was considered desirable for modelling pedestrian accessible routes in the Australian context, where OpenStreetMap paths tagged for cycling were found to provide important connectivity traversable by pedestrians, for example across the Yarra River in Melbourne; at the time, these paths were absent from the ‘walk’ network otherwise intended for walking behaviour.
Function 1. Custom pedestrian network function used to construct the OpenStreetMap-derived routable pedestrian network.
pedestrian = ('["area"!~"yes"]["highway"!~"motor|proposed|construction|abandoned|platform|raceway"]["foot"!~"no"]["service"!~"private"]["access"!~"private"]')
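To make the effect of Function 1 concrete, the following sketch evaluates its negated conditions against example tag sets in plain Python. It is an illustration only: it assumes that a negated regular-expression condition passes when the key is absent or when the pattern is not found in the value, and it does not reproduce how OSMnx assembles or submits the query (for example, any requirement that a highway tag be present is left out). The helper name and the example tag sets are hypothetical.

```python
import re

# Negated conditions from Function 1, expressed as key -> excluded pattern.
PEDESTRIAN_EXCLUSIONS = {
    "area": "yes",
    "highway": "motor|proposed|construction|abandoned|platform|raceway",
    "foot": "no",
    "service": "private",
    "access": "private",
}


def passes_pedestrian_filter(tags):
    """Return True if a way's tags survive every negated condition.

    A missing key passes its condition; a present key fails when the
    pattern is found anywhere in the value (unanchored match).
    """
    for key, pattern in PEDESTRIAN_EXCLUSIONS.items():
        value = tags.get(key)
        if value is not None and re.search(pattern, value):
            return False
    return True


# Hypothetical tag sets for four ways.
examples = {
    "shared cycle path": {"highway": "cycleway"},
    "motorway ramp": {"highway": "motorway_link"},
    "private footway": {"highway": "footway", "access": "private"},
    "residential street": {"highway": "residential", "foot": "yes"},
}
kept = {name: passes_pedestrian_filter(tags) for name, tags in examples.items()}
```

In this sketch the cycle path is retained because "cycleway" does not appear among the excluded highway patterns, which mirrors the decision described above to keep paths tagged for cycling.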
Using each of these derived pedestrian network datasets, we conducted a preliminary network analysis of the distance to closest bus stop (2012 data) for residential address points with unique locations within the Melbourne ABS 2016 Greater Capital City Statistical Area (GCCSA) using road networks and destinations extending to 10 km beyond the GCCSA boundary. Restricted to the ‘Major Urban’ or ‘Other Urban’ Sections of State, there were 1,718,271 residential address points in the urban portion of Greater Melbourne. Origin-Destination matrix (OD matrix) analyses were conducted using 64-bit Python 2.7 with the ArcGIS arcpy library and Network Analyst extension with results output to an SQL database using PostgreSQL 9.6. We examined differences in the distribution of distance to reach a bus stop and the overall count of null values using each network type. A null result was interpreted as being suggestive of isolated failures to represent real world network connectivity in this urban context where access to a bus stop within a reasonable distance could be expected for most address point origins. The results using the PSMA-derived network returned 2,083 nulls (0.12%), whilst those using the preliminary OpenStreetMap derived network returned 40 (0.002%). The additional modest number of null values in the PSMA network may be partially accounted for by the difference in date of network data publication (2013, compared to 2018); most real-world network changes would be expected to occur in new developments on the urban fringe. To facilitate fair comparisons of differences in distributions, summaries were conducted only for address points with observations in common using both network sources (n = 1,716,150).
For each address point, the distance to closest bus stop calculated using the OpenStreetMap-derived network was subtracted from the result obtained using the PSMA 2013 network, where findings were returned for both networks. The resulting differences provide an indication of similarity, as summarised in Table 8. The difference for most addresses was less than 10 m (interquartile range −3 to 8 m), while the median difference was 1 m. These differences were positively skewed, indicating that for most addresses the OpenStreetMap-derived network produced shorter distance estimates than the PSMA network, reflective of greater connectivity. While the mean difference was small (11 m), some differences were large enough to be meaningful (standard deviation of 168 m). In outlying circumstances some addresses would travel more than 500 m further using the PSMA network to reach a bus stop (99th percentile of difference), while using the preliminary OpenStreetMap-derived network the outlying scenario approached 500 m.
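The comparison described above (counting nulls for each network, then summarising paired differences only for addresses with results from both) can be sketched as follows. This is a simplified illustration using the Python standard library rather than the ArcGIS and PostgreSQL workflow used in the study; the function name, address keys and distances are hypothetical.

```python
from statistics import mean, median, quantiles, stdev


def compare_distance_estimates(reference, alternative):
    """Summarise nulls and paired differences (reference minus alternative).

    Both arguments map an address identifier to a distance in metres,
    or to None where no route was found.
    """
    null_reference = sum(1 for d in reference.values() if d is None)
    null_alternative = sum(1 for d in alternative.values() if d is None)
    # Keep only addresses with a result from both networks.
    differences = [
        reference[key] - alternative[key]
        for key in reference
        if reference[key] is not None and alternative.get(key) is not None
    ]
    q1, _, q3 = quantiles(differences, n=4, method="inclusive")
    return {
        "null_reference": null_reference,
        "null_alternative": null_alternative,
        "n_common": len(differences),
        "median": median(differences),
        "iqr": (q1, q3),
        "mean": mean(differences),
        "sd": stdev(differences),
    }


# Hypothetical distances (m) to closest bus stop for six addresses.
psma = {"a": 310.0, "b": 505.0, "c": None, "d": 120.0, "e": 900.0, "f": 250.0}
osm = {"a": 300.0, "b": 500.0, "c": 450.0, "d": 123.0, "e": 400.0, "f": None}

summary = compare_distance_estimates(psma, osm)
```

A positive difference means the alternative network gave the shorter distance; in the toy data a single large positive difference pulls the mean well above the median, the same kind of positive skew reported for the real comparison.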
When conducting a post hoc comparison analysis of results using the PSMA 2013 network and the final derived OpenStreetMap pedestrian network (October 2018) with the exclusions listed in Table 8, the distance to closest bus stop was found to be 66 m closer when using the approach adopted in the study, albeit with considerable variability (standard deviation 530 m). An important caveat with this comparison is that the preliminary analysis was conducted using bus stops from 2012 while the final analysis was conducted using bus stops from 2018, with some changes to locations of bus stops between those time points. This impacts comparability because in such cases the change in distance relates not to improved representation of pedestrian routing options or restriction to valid locations, but rather to change in the representation of where bus stops are located. For a fairer representation of the impact of the exclusions employed in the final analysis, comparison of the preliminary analysis using 2012 bus stops for both networks was repeated with these additional records excluded (n = 1,711,863). Little difference was observed, with most distributional estimates remaining unchanged from the analysis with 1,716,150 records.
Length of road network by sections of state comparison
We also compared total road length by section of state classification (Major Urban, Other Urban, Bounded Locality, or Rural Balance) using the OpenStreetMap-derived pedestrian network (October 2018) and the official Victorian Vicmap roads 2018 dataset59, excluding freeways, proposed roads and boat/ferry routes. While the results were influenced by both the coverage and density of network representation, this overall comparison emphasised the strength of the OpenStreetMap-derived network for urban areas, and underscored the importance of restricting our liveability indicator analyses drawing on OpenStreetMap data to urban areas (Table 9).
Street intersection model
To evaluate street connectivity (e.g., intersections per km²), data containing representations of street networks for mapping or routing purposes required simplification of the intersections (nodes) of network segments (edges). For example, a mapped representation of a roundabout or large street intersection on OpenStreetMap may involve multiple points where lanes of traffic or other paths intersect. While this will not necessarily pose a problem for evaluating routing through the network (other than increased processing and memory demands arising from complexity), if those nodes are naively taken to represent real-world intersections, then measures of street connectivity for that location will be over-estimated. The OSMnx Python module includes a function to simplify network topology, which given a parameter for tolerance distance will return the centroid of points identified within that spatial window39. For OSMnx 0.81 as used in the study, the function was clean_intersections(graph, tolerance, dead_ends=False); in more recent versions the equivalent function is consolidate_intersections(graph, tolerance, rebuild_graph=False, dead_ends=False). We conducted a sensitivity analysis to evaluate the choice of parameter across different network topologies identified in different Australian cities, for example residential neighbourhoods with roundabouts and cul-de-sacs in Perth and Canberra, and an area of Melbourne’s CBD with tight laneways and market areas (Fig. 4). Based on this analysis, we determined that to approximate the cleaning algorithm used in our previous work for the common network topologies observed in Australian cities using the October 2018 export of OpenStreetMap, a tolerance distance of 12 m was an appropriate compromise for these settings.
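The idea of collapsing clustered nodes into a single representative intersection can be illustrated with a simplified stand-in. The sketch below is not the OSMnx implementation: it groups nodes that lie within a hypothetical merge_distance of one another (transitively) and returns one centroid per group, and no claim is made about how merge_distance maps onto the OSMnx tolerance parameter. Coordinates are assumed to be in a projected coordinate system with units of metres.

```python
from itertools import combinations
from math import hypot


def consolidate_nodes(nodes, merge_distance):
    """Merge nearby nodes and return one centroid per cluster.

    nodes: list of (x, y) tuples in metres.
    merge_distance: nodes this close (or closer) join the same cluster.
    """
    parent = list(range(len(nodes)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Pairwise comparison is quadratic; adequate for a small illustration.
    for i, j in combinations(range(len(nodes)), 2):
        (x1, y1), (x2, y2) = nodes[i], nodes[j]
        if hypot(x1 - x2, y1 - y2) <= merge_distance:
            parent[find(i)] = find(j)

    clusters = {}
    for i, point in enumerate(nodes):
        clusters.setdefault(find(i), []).append(point)

    return [
        (sum(x for x, _ in pts) / len(pts), sum(y for _, y in pts) / len(pts))
        for pts in clusters.values()
    ]


# Hypothetical roundabout mapped as four nodes, plus one distant junction.
nodes = [(0, 10), (10, 0), (0, -10), (-10, 0), (200, 0)]

loose = consolidate_nodes(nodes, merge_distance=15)  # roundabout collapses
tight = consolidate_nodes(nodes, merge_distance=5)   # nothing merges
```

With the larger distance the roundabout is counted as one intersection, while with the smaller distance it is counted as four, which is the kind of sensitivity to the parameter that motivated comparing values across different network topologies.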
Modelling and evaluating access using public open space
Public open space means different things to different people, and in the context of this national study we sought a consistent definition that we applied across jurisdictions. The Victorian Planning Authority defines open space as land providing outdoor recreation, leisure and/or environmental benefits and/or visual amenity; and public open space as land which is publicly owned, accessible, has primary purpose for outdoor recreation, leisure, conservation, waterways and/or heritage, and meets the definition of open space60. This definition, which we attempted to approximate using data derived from OpenStreetMap, may describe a broad range of public places, including parks, squares, beaches, and conservation areas. This approach is further detailed in the supplementary material usage notes located at https://github.com/carlhiggs/Australian-National-Liveability-Study-2018-datasets-supplementary-material.
To evaluate the impact of choice of public open space dataset on estimates of dwellings with access to public open space, we conducted a preliminary analysis comparing estimates for percentage of dwellings having access to a public open space within 400 m meeting a series of conditions: (1) any public open space; (2) having area of 1 hectare or larger; and (3) having area of 1 hectare or larger, or any size with a sports facility. Evaluation of access to any public open space was based on a measure used by the VPA60. Evaluation of the latter two typologies was based on Standard C13 of the Victorian Planning Provisions61: ‘Local parks within 400 metres safe walking distance of at least 95 percent of all dwellings. Where not designed to include active open space, local parks should be generally 1 hectare in area and suitably dimensioned and designed to provide for their intended use and to allow easy adaptation in response to changing community preferences’. Public open space feature datasets were derived using two official Victorian open space datasets, Victorian Planning Authority (VPA) and Vicmap Features of Interest (FOI), as well as OpenStreetMap retrieved for a 10 km expanse beyond the boundary for Greater Melbourne. The OpenStreetMap-derived datasets (preliminary versions 1 and 2, and the final one we employed in our study) were constructed using a series of tags informed by review of the VPA definitions, OpenStreetMap tagging guidelines for public open spaces, and empirical review of satellite data for the included cities in our study. Access was further evaluated using both the OpenStreetMap-derived and Vicmap-derived pedestrian networks based on 2018 data.
The VPA open space dataset used for comparison analysis was created in 2016 as part of a review into Melbourne’s metropolitan open space network and included open space features pre-categorised into Public, Restricted or Private open space; while we considered this a gold standard reference for public open space data, it had coverage only for 32 municipalities of Greater Melbourne (missing Murrindindi Shire, Mitchell Shire, Macedon Ranges Shire, Moorabool Shire)60. The Vicmap FOI open space dataset was created by the Victorian Government Department of Environment, Land, Water and Planning in 2016 and last updated prior to our retrieval in 2018; it is available as part of Vicmap’s Features of Interest dataset, with state-wide coverage and planned annual update subject to available funding62. Results of this analysis are presented in Table 10.
When considering access to public open space by road network dataset, we found only very marginal differences in estimates for percentage of dwellings with access regardless of typology or open space data source, with these mostly related to fringe areas which were excluded following restriction to the metropolitan urban area. Results using the OpenStreetMap-derived public open space datasets differed by less than 1% for urban areas when using an OpenStreetMap-derived pedestrian network compared to one constructed from the official Vicmap transport dataset. This difference for urban areas due to choice of network dataset was further reduced when revising the OpenStreetMap public open space criteria for representation of public open spaces outside of the Melbourne setting. This suggests that the use of OpenStreetMap for routing in Australian urban settings like Melbourne is valid, a finding that is supported by the work of other researchers38 and that supports generalised usage for other urban settings in our study. This preliminary analysis also reinforced our restriction to address points in Major Urban or Other Urban sections of state.
The magnitude and direction of differences in estimates for access to public open space when using the OpenStreetMap-derived public open space datasets as compared to the VPA ‘gold standard’ varied by class of public open space. The estimates for percentage of urban dwellings with access to any public open space were approximately 13% lower using the final OpenStreetMap-derived dataset than when using the Victorian gold standard dataset. However, access to large public open space was approximately similar; and when considering access to a large public open space, or of any size with a sports facility, estimates were approximately 16% higher when using the final OpenStreetMap-derived dataset. This suggests that while the OpenStreetMap-derived dataset may not have had as comprehensive inclusion of incidental ‘pocket’ or sliver parks and other public open space types, the representation of larger, multipurpose recreational public open spaces was accurate; further, the capacity for querying provision of sporting amenities was far greater using the final OpenStreetMap-derived data.
Estimates for access also varied across iterations of revisions of the method used to derive public open space features. As noted above, revisions of the method were broadly motivated by the application to settings beyond the preliminary Melbourne test setting, where we evaluated identified public open spaces against satellite imagery. As such, our first attempt at re-creating the VPA public open space dataset for Melbourne using OpenStreetMap could be regarded as being over-fit to the Melbourne context, and as we modified the approach to tagging to ensure adequate performance in terms of our empirical face validity checks in our other cities, the differences from the Victoria-specific data became larger. However, by the final iteration, the important negative difference appeared to be in representation of ‘any’ public open space; while differences for large open space were minimal, capacity to identify specific sport and leisure facilities associated with parks was greatly enhanced using additional information from OpenStreetMap sport and leisure-related tags.
We concluded that the broader coverage and more timely representation of open space features in OpenStreetMap meant that in addition to yielding approximately similar results for important scenarios, it was suitable for analysis of access to public open space in urban areas in the absence of other quality, consistent public open space data with national coverage for Australia. Further detail on the implementation of the derived public open space data is provided as supplementary material.
The above sensitivity analysis was focused on a typology of public open space based on a specific set of recommendations made in the Victorian context. Our analysis in the national study was broader than this, however. In the first instance, we analysed the distance to all public open spaces within 3200 metres and to the closest public open space for address points, and allowed for subsequent post hoc querying for specific typologies of relevance to policy or researchers’ interests. The Urban Liveability Index contains a sub-indicator relating to proximal access to a public open space larger than 1.5 hectares that is based upon associations with increased recreational- and overall-walking behaviours in a Melbourne-based cohort63, and consequent recommendations24. However, we also measured distance to public open space: of any size; with a public toilet within 100 metres; ≤0.4 Ha; >0.4 Ha; >0.5 Ha; >1.5 Ha; >2 Ha; >0.4 to ≤1 Ha; >1 to ≤5 Ha; >5 Ha to ≤20 Ha; >5 Ha; >20 Ha; and having a sport facility. These measures of ‘distance to closest’ are based on typologies having relevance to specific policy settings around Australia18, and can be used to derive threshold-based indicators (for example, a Boolean indicator for access within 400 metres). Further, we provide guidance in our supplementary usage notes on GitHub for researchers to define and analyse access to areas of open space using parameters of relevance to their own agenda and research settings. That is, distance, size, attributes and co-locations may be queried as per the examples provided, as required.
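As an illustration of this kind of post hoc querying, the sketch below derives ‘distance to closest’ measures and a threshold-based Boolean indicator from a list of open spaces reachable from one address. The record layout (distance_m, area_ha, has_sport), the function names and the values are hypothetical; the published datasets and usage notes should be consulted for the actual data structure.

```python
def distance_to_closest(open_spaces, min_area_ha=None, require_sport=False):
    """Distance (m) to the closest open space meeting the criteria, or None.

    open_spaces: list of dicts with keys distance_m, area_ha, has_sport.
    min_area_ha: if given, keep only spaces strictly larger than this area.
    require_sport: if True, keep only spaces with a sport facility.
    """
    candidates = [
        space["distance_m"]
        for space in open_spaces
        if (min_area_ha is None or space["area_ha"] > min_area_ha)
        and (not require_sport or space["has_sport"])
    ]
    return min(candidates) if candidates else None


def within_threshold(distance_m, threshold_m=400):
    """Boolean access indicator; a missing distance counts as no access."""
    return distance_m is not None and distance_m <= threshold_m


# Hypothetical open spaces within 3200 m of a single address point.
spaces = [
    {"distance_m": 150, "area_ha": 0.2, "has_sport": False},
    {"distance_m": 380, "area_ha": 1.2, "has_sport": False},
    {"distance_m": 650, "area_ha": 3.0, "has_sport": True},
]

any_size = distance_to_closest(spaces)
larger_than_1_5_ha = distance_to_closest(spaces, min_area_ha=1.5)
with_sport = distance_to_closest(spaces, require_sport=True)

access_any_400 = within_threshold(any_size)
access_large_400 = within_threshold(larger_than_1_5_ha)
```

Storing the full set of distances per address, rather than a single pre-computed indicator, is what allows different size thresholds and attributes to be applied after the network analysis has been run.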
Evaluating access to closest supermarket
Walkable access to a supermarket is an important indicator of a healthy food environment, representing local availability of fresh food, in addition to opportunities for incidental physical activity16. In addition to being a measure in its own right, it contributes to the walkability and urban liveability indices in this study, and thus accuracy of measurement was of particular importance. However, evaluation of access to points of interest is contingent on the quality of the data used, and commercial datasets are no guarantee of quality. As a precursor activity to this study, the lead author conducted a review of destination data sources with national scope in 2016, including the Macroplan supermarket data from Pitney Bowes (November 2014). Among other issues, this review identified that 34 records for Foodworks stores (34/435 = 0.0782 or 7.8% of stores in the dataset) had the Y coordinate incorrectly recorded as a linear function of the X coordinate: Y = -(X/10). At the time these were corrected using locations determined through web-searching. For the 2017 Creating Liveable Cities report which analysed Australia’s capital cities, our research group determined that higher quality contemporary data could be retrieved using web-scraping of major supermarket chains in Australia24.
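A check for the coordinate error described above can be sketched in a few lines. The function name, tolerance and store records below are hypothetical, and the sketch is not the procedure used in the original review; because a genuine location could in principle satisfy the same relationship, flagged records would still need manual verification.

```python
def has_linear_coordinate_error(x, y, tolerance=1e-6):
    """Flag a record whose Y coordinate equals -(X / 10), within tolerance."""
    return abs(y - (-(x / 10))) <= tolerance


# Hypothetical store records: (name, X coordinate, Y coordinate).
stores = [
    ("Store A", 144.9631, -37.8136),  # plausible Melbourne location
    ("Store B", 145.0, -14.5),        # Y equals -(X / 10): suspicious
]

flagged = [name for name, x, y in stores if has_linear_coordinate_error(x, y)]

# Share of affected records reported in the text: 34 of 435 stores.
share_percent = round(34 / 435 * 100, 1)
```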
However, when scaling up to Australia’s 21 cities the authors were also aware that independent grocery chains play a major role and were not always captured in the web-scraped data. We hypothesised, then demonstrated, that when determining the distance to closest supermarket the best estimate for individual address locations could be achieved by taking the minimum of their respective estimates using the major chain scraped data (where we assumed that 2017 supermarkets persisted in 2018) and supermarkets identified using OpenStreetMap tags informed through a review of OpenStreetMap TagInfo and Australian tagging guidelines64,65 (see Supplementary Table 1). Table 11 summarises the median and interquartile range of estimates for distance to closest supermarket by city and across the data sources: web-scraped; OpenStreetMap-derived; the difference between these estimates; and the row-wise minimum of these two records for each address. The latter is the method we used for evaluating distance to closest supermarket for indicators in the study, to ensure that error for individual address locations due to incompleteness of data was minimised. Overall, the estimates using the two datasets separately were similar, with the median difference being 22 m, slightly in favour of the web-scraped data but with relatively broad interquartile ranges indicating that at least some addresses in each city were better served by accounting for access to a supermarket using the OpenStreetMap data.
The results in Table 11 could also suggest geographic variability of coverage, with access to a supermarket for addresses in regional cities of Australia’s easternmost states (Queensland and New South Wales) performing more strongly using the major chain supermarket dataset overall, while addresses in cities of Australia’s more southern states and territories (South Australia, Victoria, Tasmania and Australian Capital Territory) tended towards better performance using the OpenStreetMap dataset. A plausible explanation could be that the latter cities may contain a greater number of independent supermarkets, or chains other than those selected for in the 2017 major chain web-scraping exercise (Aldi, Coles, Foodworks, IGA and Woolworths)24, which were better captured in the OpenStreetMap-derived data. This suggests that by pooling the data the risk of misclassification error for individual addresses when considering access to a supermarket was mitigated, compared to using either of the data sources on their own.
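The row-wise minimum used to pool the two supermarket data sources can be sketched as follows. The handling of missing values shown here (falling back to whichever estimate is available) is an assumption made for illustration, and the function name, address identifiers and distances are hypothetical.

```python
def pooled_distance(scraped_m, osm_m):
    """Row-wise minimum of two distance estimates, ignoring missing values."""
    available = [d for d in (scraped_m, osm_m) if d is not None]
    return min(available) if available else None


# Hypothetical distances (m) to closest supermarket from each data source.
scraped = {"addr1": 420.0, "addr2": 950.0, "addr3": None, "addr4": None}
osm = {"addr1": 610.0, "addr2": 300.0, "addr3": 800.0, "addr4": None}

pooled = {
    address: pooled_distance(scraped[address], osm[address])
    for address in scraped
}
```

For each address the pooled estimate is never larger than either single-source estimate, so a supermarket missing from one dataset does not inflate the measured distance when it is present in the other.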