
Ken Robinson, a maths graduate, was a long-serving officer of Milton Keynes Development Corporation. He managed the Land Information Systems project on behalf of the Corporation from the acceptance of the initial feasibility study through to the winding up of Milton Keynes Development Corporation on 31 March 1992. He is now employed by the Commission for the New Towns to manage the Land Information System inherited from the Development Corporation.
As more and more Geographic Information Systems come on line the amount of data being held within systems is increasing. Most of this data has been collected for a specific purpose and as people gain experience of these systems they start to question whether additional data can be incorporated into the GIS. This paper looks at a number of data sets which we have assimilated into our land information system for Milton Keynes.
The data sets discussed within the paper have provided an up to date gazetteer, a theme containing street address data, one containing geotechnical information and one which increased our main theme coverage by 25% with minimal effort.
Many sites are becoming more experienced with their systems and their data and, in consequence, more creative in their manipulation of that data. The techniques discussed in this paper will be of little use to consultants, who will already be aware of similar work flows; however, I hope they may help other users to extract additional value from their data. The system in place on our site was supplied by Intergraph (UK) and the functionality used to extract the data is part of that system. However, I have tried to explain the function of the inbuilt modules so that other sites can adapt our ideas to their own systems. Our methodology relies heavily on PERL, the Practical Extraction and Report Language, which is freely available and comes bundled with most versions of UNIX. PERL is also available for DOS and WINDOWS NT. I believe that many sites will benefit from the time taken to learn PERL, and those skills can be transferred to other platforms as operating systems are updated.
Milton Keynes Development Corporation (MKDC) was formed under an Act of Parliament in 1967 to develop a new city for 250,000 people in Buckinghamshire, England. After twenty-five years of successful development, MKDC was wound up by central government on 31 March 1992 and the responsibility for the continuing development of Milton Keynes was handed to the Commission for the New Towns (CNT), a government agency. In order to expedite the transfer of responsibilities from MKDC to CNT, MKDC undertook the installation of a Land Information System which concentrated on computerising the land terrier function.
Like many sites in the UK we use OS 1:1250 base mapping - the Land-Line product - for the great majority of our work. In practice most GIS users reference their themes to the OS base map rather than to the National Grid; for most purposes the grid can be considered an abstract concept. This map set is feature coded, and when we import it into our system we keep the feature coded structure. This gives much greater flexibility to display or extract given features, and also stores the data according to strict protocols.
OS base mapping contains much text information, both inside the sheet boundaries and around the edges. The street name information is in its correct geographical position, so if we can search the base map for a street name then we have, in effect, a gazetteer.
When we import OS Land-Line data into our system we maintain the separation of the different data classes. Each class of feature within OS Land-Line data has its own feature code, which is used when the transfer format is NTF, as specified by BS 7567. Where the transfer format is DXF, each class of feature is identified by a layer instead. For our purposes we want to consider street names, which are represented as code 1000 and as layer GS8011000. Within our system we have associated, or linked, a database table with this feature. This table contains three columns: the identity of the map sheet, the element link number and the street name.
When we create the gazetteer we want the system to carry out the majority of the work for us. Ideally we want a shell script or PERL script to run overnight. (Shell scripts run on UNIX systems, whereas PERL scripts run on UNIX, DOS or WINDOWS NT systems.) It is also important that any script uses the modules which are already provided within the core application software and which can be run from the command line. Therefore we require a script which can go through each base map in turn and update the three columns in the database table with the correct information for each row, using the modules which are readily available within our systems.
Within our system we run a script which loops through each base map and creates a list file of all graphic elements which have feature code 1000. It then loads a row into the database table for each element in the list file and fills the column for element link number. Again using the list file we update the street name column from the text attached to the feature and finally we update the map identity before deleting the list file and moving on to the next file. A listing of the script file using a meta language follows:
    FOR FILE IN sp*.dgn
        RUN ListBuilder > listfile
        RUN BlankLoader < listfile
        RUN TextLoader < listfile
        RUN MapidLoader < listfile
        DELETE listfile
    LOOP
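As a rough sketch only, the same loop might be written in PERL along the following lines. ListBuilder, BlankLoader, TextLoader and MapidLoader are the module names from the meta language, used here as stand-ins; the real command names and arguments depend on the vendor software. Passing the base map name as an argument is an assumption, as is the idea that MapidLoader reads the list file like the two loaders before it. The script only prints the commands (a dry run), so it can be checked before it is allowed near the database.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the site's command line modules.
my @LOADERS = qw(BlankLoader TextLoader MapidLoader);

# Build the commands for one base map file without running them.
sub commands_for {
    my ($dgn, $listfile) = @_;
    my @cmds = ("ListBuilder $dgn > $listfile");
    push @cmds, map { "$_ $dgn < $listfile" } @LOADERS;
    return @cmds;
}

# Run (or, in a dry run, just print) the commands for every base map
# matching the pattern, and return the number of maps processed.
sub process_maps {
    my ($pattern, $listfile, $dry_run) = @_;
    my $count = 0;
    for my $dgn (sort glob($pattern)) {
        for my $cmd (commands_for($dgn, $listfile)) {
            if ($dry_run) { print "$cmd\n"; next; }
            system($cmd) == 0 or die "Command failed: $cmd\n";
        }
        unlink $listfile unless $dry_run;   # DELETE listfile
        $count++;
    }
    return $count;
}

# Dry run: show what would be executed overnight.
my $n = process_maps('sp*.dgn', 'listfile', 1);
print "$n base map(s) found\n";
```

Changing the last argument of process_maps to 0 would run the commands for real and stop at the first failure, which is usually what one wants from an unattended overnight job.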
Now each instance of a road name on the OS base mapping has an associated row in the table of street names. If we want to find the location of a street we can perform an SQL query on the table using the road name and instruct the query to centre and highlight the instances of the street name. In order to highlight the street name the software will attach the OS base map as a reference file to the current working file.
Milton Keynes Development Corporation purchased, in 1989, an experimental data set called OSLAND. This data set was a collection of land parcels each of which contained a seed point and attached to the seed was the street address of the property. This data set, in the event, did not provide the expected benefits and it was put to one side. Some time later we realised that we had purchased a data set which, in essence, contained address point information.
We then reprocessed the data and extracted from it the address string and the co-ordinates of the seed point. This information was placed in a large ASCII file containing about 130,000 property seed points. A typical entry in this file looked like:
    Easting  Northing  Address string
    484341   234341    Saxon Court 502 AVEBURY BOULEVARD
Again we used a PERL script to take this data and change it into a format compatible with our database. The output of this script then had the format:
    Easting|Northing|Building|Number|Street
    484341|234341|Saxon Court|502|AVEBURY BOULEVARD
Notice the 'pipe' symbols used as separators between the fields of the record. This database table was queried overnight by the point placer process which took each row in turn and placed the character 'x' at the position of each property seed in the newly created graphics file.
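The original conversion script is not reproduced in this paper, so the following is only a minimal sketch of how the reformatting step might be done. It assumes that the property number is the first token in the address made up of digits (optionally followed by one letter), that anything before it is the building name and that anything after it is the street. Real address data will contain exceptions to that rule, and the subroutine name reformat_address is invented for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn one "easting northing address" line into a pipe-separated record.
# Returns undef for lines that do not start with two numeric co-ordinates,
# such as the column heading line.
sub reformat_address {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;                        # trim both ends
    my ($east, $north, $address) = split /\s+/, $line, 3;
    return unless defined $address
        && $east =~ /^\d+$/ && $north =~ /^\d+$/;

    # Assumed rule: building name, then number, then street.
    my ($building, $number, $street) = ('', '', $address);
    if ($address =~ /^(.*?)\s*\b(\d+[A-Za-z]?)\s+(.+)$/) {
        ($building, $number, $street) = ($1, $2, $3);
    }
    return join '|', $east, $north, $building, $number, $street;
}

# Sample input taken from the entry shown earlier in this paper.
my @input = (
    'Easting Northing Address string',
    '484341 234341 Saxon Court 502 AVEBURY BOULEVARD',
);

print "Easting|Northing|Building|Number|Street\n";
for my $line (@input) {
    my $record = reformat_address($line);
    print "$record\n" if defined $record;
}
```

An address with no number falls through with the whole string in the street field, so nothing is lost; such rows can be picked out afterwards by looking for an empty number field.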
Now we possess this file we can use it to ascertain all the addresses within a defined area. Say we want to know all the properties affected by a forthcoming development; then we can define the area of the development and query the address file for all those seed points within 100m of the boundary of the development.
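On our site this query is carried out by the GIS itself, but the underlying test is simple enough to sketch in PERL for sites that only hold the pipe-separated file. The sketch below takes the wording literally: it keeps every seed point whose shortest distance to the boundary line is no more than the limit. The boundary co-ordinates and the second record are invented for the illustration, and the subroutine names are my own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Shortest distance from point (px,py) to the segment (ax,ay)-(bx,by).
sub dist_to_segment {
    my ($px, $py, $ax, $ay, $bx, $by) = @_;
    my ($dx, $dy) = ($bx - $ax, $by - $ay);
    my $len2 = $dx * $dx + $dy * $dy;
    my $t = $len2 == 0 ? 0
          : (($px - $ax) * $dx + ($py - $ay) * $dy) / $len2;
    $t = 0 if $t < 0;                     # clamp to the segment ends
    $t = 1 if $t > 1;
    my ($cx, $cy) = ($ax + $t * $dx, $ay + $t * $dy);
    return sqrt(($px - $cx) ** 2 + ($py - $cy) ** 2);
}

# Shortest distance from a point to a closed boundary, given as a
# reference to a list of [easting, northing] pairs.
sub dist_to_boundary {
    my ($px, $py, $boundary) = @_;
    my $best;
    for my $i (0 .. $#$boundary) {
        my $p = $boundary->[$i];
        my $q = $boundary->[($i + 1) % @$boundary];   # wrap to close the ring
        my $d = dist_to_segment($px, $py, @$p, @$q);
        $best = $d if !defined $best || $d < $best;
    }
    return $best;
}

# Return the records whose seed point lies within $limit metres
# of the boundary.
sub records_near {
    my ($records, $boundary, $limit) = @_;
    return grep {
        my ($e, $n) = split /\|/, $_;
        dist_to_boundary($e, $n, $boundary) <= $limit;
    } @$records;
}

my @boundary = ([484300, 234400], [484400, 234400],
                [484400, 234500], [484300, 234500]);   # invented site
my @records = (
    '484341|234341|Saxon Court|502|AVEBURY BOULEVARD', # from this paper
    '484900|234341||1|EXAMPLE STREET',                 # invented, far away
);
print "$_\n" for records_near(\@records, \@boundary, 100);
```

With these values only the Saxon Court record is printed, as its seed point lies 59m south of the invented boundary. A seed point deep inside a large development area would be excluded by this literal reading; a point-in-polygon test would have to be added if such properties are also wanted.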
Our engineering department has many borehole records, which are recorded on log sheets. To index these records the department has built an index file within LOTUS 123. To make their searches easier we decided to build a graphic theme showing the location of each borehole. We therefore obtained a file exported from LOTUS 123 containing co-ordinates for all the boreholes and trial pits which had been dug over the life of the new town development. This file had the following format:
    1 AK11   Bore Hole 484318 234881
    2 MK121S Trial Hol 484529 234712
    3 BB1212           484611 233981
    4 aa12G  BOREHOLE  484536 234915
Again, after passing the file through a PERL script to convert it into a more acceptable format, we obtained the following file:

    ||484318|234881|AK11|BORE|
    ||484529|234712|MK121S|TRIAL|
    ||484611|233981|BB1212|UNKNOWN|
    ||484536|234915|aa12G|BORE|
This file was then imported into our database using the SQL INSERT statement. Once the points were in the database they were placed in the graphics using the Point Placer module. Within our system we differentiate between bore holes, trial pits and investigations where the type of hole is unknown by using different colour symbology; consequently we ran Point Placer three times. Each run placed a different set of points, selected by an SQL query, and the colour was changed from red to green to yellow between runs.
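The selection behind the three runs can be illustrated with a short PERL sketch which groups the converted records by type of hole. This is not how Point Placer works internally; it is simply a way of checking, before the overnight run, how many points each run should place. The pairing of colour with type is an assumption which follows the order of the text: red for bore holes, green for trial pits and yellow for unknown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed pairing of hole type and colour, one entry per Point Placer run.
my @RUNS = ([BORE => 'red'], [TRIAL => 'green'], [UNKNOWN => 'yellow']);

# Group records of the form ||easting|northing|reference|type| by type.
# Returns a hash of type => reference to a list of records.
sub group_by_type {
    my @records = @_;
    my %group;
    for my $rec (@records) {
        my @field = split /\|/, $rec;      # two empty fields come first
        my $type = defined $field[5] ? $field[5] : 'UNKNOWN';
        push @{ $group{$type} }, $rec;
    }
    return %group;
}

# The four converted records shown earlier in this paper.
my @records = (
    '||484318|234881|AK11|BORE|',
    '||484529|234712|MK121S|TRIAL|',
    '||484611|233981|BB1212|UNKNOWN|',
    '||484536|234915|aa12G|BORE|',
);

my %group = group_by_type(@records);
for my $run (@RUNS) {
    my ($type, $colour) = @$run;
    my $count = scalar @{ $group{$type} || [] };
    print "Point Placer run: $type in $colour, $count point(s)\n";
}
```

Comparing these counts with the number of symbols placed in the graphics gives a cheap check that no run has missed or duplicated points.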
For those who are unfamiliar with PERL I have included this PERL script as Appendix 1. You will see that PERL, although powerful, relies on the use of regular expressions. Initially these require some care as to their exact syntax; however, they do not take long to learn, and once learned they can be used in a number of programs.
We are currently considering the benefits of scanning the borehole logs and placing them behind the identifying symbol in the graphics file. However considering the volume of data and the frequency of its use the business case to include this data will have to be strong.
At the end of 1991 MKDC transferred its rental housing stock to the Borough Council and a number of housing associations. Having used the LIS to prepare the disposal plans for the 13,000 dwellings we ended up with a series of disposal plans held within the system. Obviously with this amount of data already within the graphics we were loath to move it into the disposal theme by hand.
Each of the disposal parcels was represented by line work held in separate drawing files, approximately 80 in total, and there were no linkages to a database. As these files had been used for plotting purposes they contained the plot border and mask. We also knew that the polygons representing the disposals were not necessarily closed.
We decided to investigate methods of getting the data into the disposal theme using the minimum of manual intervention and running the transfer via a series of batch processes. Obviously we only wanted to transfer the boundary lines into the new file, but we also needed to place seed points and link them to the database. In consequence we designed the following work flow.
The work flow contains a number of steps expressed in the schema which follows. It should be noted that only item 1.3.2 was carried out by an operator. All the other stages were run via four shell scripts.
1.0 Clean up the input files
1.1 Duplicate files and work on copies
1.2 Delete all from file except boundary
1.3 Close all polygons
1.3.1 Run linecleaner with auto correct option
1.3.2 Correct unresolved errors
2.0 Feature code the files
2.1 Feature code file
2.1.1 Change the symbology to match feature code
2.2 Place linked seed points in graphics
2.2.1 Place seed points in file
2.2.2 Link seed points to database
2.3 Bulk update the new database entries
3.0 Label the seed points from database
4.0 Merge the files into the disposal theme
4.1 Auto line clean the disposal files
Now I have not included any of the necessary shell scripts with this paper as they are very specific, not only to the Intergraph suite of software but also to the symbology we have adopted on this site.
This whole process was completed over 4 nights and the manual operation at the beginning (item 1.3.2) took just under one man/day. This is significantly less than the time required to incorporate this data by hand and make the necessary links to the database.
It should be pointed out however that the writing and testing of this work flow took two man/days. It is crucial that any work flow which is undertaken as batch processes is fully tested before it is committed into both the graphics and the database.
I hope that this paper has stimulated some ideas which can be employed by other sites. We all collect vast amounts of data and often do not see the possibilities of converting a set of files into a form more suitable for use within our geographic information systems. Taking the time to learn to write shell scripts or batch files will, in the long run, pay handsome dividends on any site. However, I believe that the way forward now is to implement PERL scripts on user systems.
I hope that I have encouraged you to look at the existing data sets within your organisations and consider ways of incorporating them into your geographic information systems via batch processes and not rely on redigitising the data.
Appendix 1

#!/usr/bin/perl
# The line above runs the perl interpreter, which then
# runs the rest of this script
#
# Perl script to take the output from LOTUS 123
# containing Geotech data and output it formatted
# for MK LIS database with pipe separators
#
# Ken Robinson March 95
#
use strict;
use warnings;

open(INPUT, "<", "GEOTECH") || die "GEOTECH could not be opened: $!";
open(OUTPUT, ">", "GEOTECH.OUT") || die "Output could not be opened: $!";
while (<INPUT>)
{
    chomp;                          # remove the trailing newline
    if (!/^[\t ]*$/)                # if not a blank line
    {
        s/^\s+//;                   # strip any leading whitespace
        s/\s{11,26}/ UNKNOWN /;     # empty type column: label as UNKNOWN
        s/Bore.\w*/BORE/i;          # standardise Borehole
        s/Trial.\w*/TRIAL/i;        # & Trialhole labels, in any case
        my @line = split(/\s+/, $_);    # split the line on whitespace into
                                        # number, reference, type, easting, northing
        print OUTPUT "||$line[3]|$line[4]|$line[1]|$line[2]|\n";
    }
}
close(INPUT);
close(OUTPUT);
| Copyright © 2001 Ken Robinson |