Vectorize maps into data; polygon area calculations for QGIS
This is an example project modeling step-by-step instructions for generating geospatial data for analysis from historic maps. In June, we shared results from partnering with Joyce Chaplin’s course, Re-Wilding Harvard, where students were tasked with creating projects to investigate the histories and changing nature of open space around Harvard’s campus.
Hannah Adler ‘25 chose to study the Charles River region of Harvard’s campus, on the Cambridge side near Memorial Drive.
“I was interested in the Harvard Square adjacent area near the river, because today we think of the river walkways and Memorial Drive, especially when it’s closed, as a really valuable open space. I also had seen an image in HOLLIS that conveys a certain set of aesthetic ideals with how people were imagining the riverside could be.”
Adler, who had some GIS coursework under her belt from past semesters, visited the Harvard Map Collection, knowing she wanted to incorporate spatial analysis techniques in her project. She sought to understand:
- What did the development process look like in this area? How rapidly did this region change?
- What was Harvard’s role in developing this area?
Adler met with GIS Librarian Belle Lipton and discussed approaches to modeling the data to answer these questions.
Finding map sources
Lipton recommended a tool she worked to create at the Boston Public Library, called Atlascope. This tool incorporates over 100 different atlases from Boston and Cambridge, and assembles them into a GIS viewer so that researchers can easily compare different years and layers. Not only do these maps show the exact changes Adler was investigating, but the maps are hosted as GIS layers anyone can use in their own projects.
[Figure: Atlascope atlas layers for 1873, 1903, 1916, and 1930]
Vectorizing
After Adler selected the four temporal snapshots she wanted to study (1873, 1903, 1916, and 1930), and located GIS layers for each of the atlas years, the next step was to create polygon data for the parcels on the maps. Tracing the boundaries of each parcel and annotating those boundaries with important information allowed her to measure the scale of changes.
Attribute Information

Orange indicates parcels that are undeveloped in 1873, meaning they have no structures on them. Purple are developed parcels, or parcels with structures on them.
After creating shape data for each parcel, Adler added columns to the GIS table which tracked the following attributes:
- For each parcel, was it developed or not? This was a binary value. If the parcel had a structure on it in 1873, the `developed` field for the parcel was encoded as `true`, and if it did not have a structure, the parcel had a `false` value.
- Who was the land owner in 1873? Adler created a column to encode the names of landowners, information which is found on the maps.

Parcels highlighted in yellow are those owned by Harvard in 1873.
Adler repeated this process, vectorizing and encoding the same variables for each temporal snapshot she was interested in studying.
[Figures: paired Development and Harvard-ownership maps for each snapshot year: 1873, 1903, 1916, and 1930]
Area analysis
Next, Adler was able to leverage GIS tools to generate area calculations tallying up how many square meters exist for each parcel type. Because these vectorized shapes correspond to real locations, GIS tools are able to report the amount of area each shape represents in reality. This works because the datasets are projected into specific coordinate reference systems that link the shape drawings with real places, using specific units of measurement.
Adler used the coordinate reference system `Massachusetts State Plane` or `EPSG:26986`, which is the same coordinate reference system Massachusetts uses to store GIS data in `MassGIS`, the state’s open geodata portal. This generates area calculations in the unit of `square meters`.
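To make the link between projected coordinates and area concrete, here is a minimal Python sketch of a planar area calculation (the shoelace formula). It assumes the vertex coordinates are already in a projected coordinate reference system measured in meters, such as `EPSG:26986`. The parcel coordinates are made up for illustration, and QGIS's own calculation may differ slightly, for example if the project is set to measure on an ellipsoid.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def planar_area(ring: Sequence[Point]) -> float:
    """Area of a simple polygon using the shoelace formula.

    Coordinates are assumed to be in a projected CRS whose unit is
    meters, so the result is in square meters.
    """
    twice_area = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]  # wrap around to close the ring
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


# Hypothetical 50 m x 30 m rectangular parcel (made-up coordinates).
parcel = [
    (233000.0, 902000.0),
    (233050.0, 902000.0),
    (233050.0, 902030.0),
    (233000.0, 902030.0),
]
print(planar_area(parcel))  # 1500.0
```

Because the coordinates are in meters, the result reads directly as square meters without any conversion.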
Adler exported the area calculations to tabular formats (`.csv`), so she could analyze them further using other software, such as `R`. Her findings are presented in the following table.
While the total area developed (parcels with structures on them) increases only a small amount between 1873 and 1930, the area owned by Harvard increases dramatically during this period. Adler reported that the historic atlases include many other variables one could choose to encode and analyze with this same methodology: filter the traced parcels by an attribute, then calculate the area of the resulting subset.
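The filter-then-sum idea can be sketched in a few lines of Python. Adler did her analysis in R; this is only an illustration in another language, not her script. The column names `developed`, `owner_name`, and `Area` follow the fields described in this guide, while the sample rows and the owner `Example Owner` are made up.

```python
import csv
import io
from collections import defaultdict

# Made-up rows standing in for an exported attribute table.
SAMPLE_CSV = """developed,owner_name,Area
true,Harvard College,1200.5
false,Harvard College,800.0
true,Example Owner,450.25
false,Example Owner,300.0
"""


def area_by(rows, field):
    """Sum the Area column for each distinct value of `field`."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[field]] += float(row["Area"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(area_by(rows, "developed"))   # {'true': 1650.75, 'false': 1100.0}
print(area_by(rows, "owner_name"))  # {'Harvard College': 2000.5, 'Example Owner': 750.25}
```

To run this against a real export, replace `io.StringIO(SAMPLE_CSV)` with an opened file, and adjust the column names to match your own table.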
Try out the project data by downloading it from the Harvard Geospatial Library (HGL).
- Go to https://doi.org/10.17605/OSF.IO/C9GV3.
- Select `Files` across the top menu bar.
- Select `Download this folder`.
How to use this approach (a step-by-step guide)
Set a project coordinate reference system
- Download QGIS.
- Create a new QGIS project.
- Add a basemap to the project by going to the `Browser` panel, expanding `XYZ Tiles`, and double-clicking `OpenStreetMap`.
If you do not see a browser panel, you can go to the program menu at the top of the screen, select `View` > `Panels`, and turn on `Browser`.
- The project coordinate reference system should now reflect that of the basemap we just added. In the bottom-right corner of the QGIS window, find the button that says `EPSG:3857`. Click that button to open the `Project Properties - CRS` window.
You can search for coordinate reference systems by name, place, or EPSG code. You will want to choose a projection that is suited to the area you are creating data for, and is measured in units such as meters or feet, if you want to do area calculations. You can find this information in the properties of each coordinate reference system.
- Zoom in to the area of interest using the Zoom buttons.
- If the map disappears when you move it, or looks distorted, the software may be reprojecting the basemap into the coordinate reference system you have chosen. Wait for the screen to reload, or zoom to the extent of the basemap by right-clicking `OpenStreetMap` in the layer list and choosing `Zoom to Layer`. Continue zooming until the area of interest is centered on the map. You can click through or ignore any notifications about ballpark transformations.
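To see why the basemap's default `EPSG:3857` (Web Mercator) is a poor choice for area calculations, the sketch below estimates how much that projection exaggerates areas at a given latitude. It uses a spherical approximation, in which the linear scale factor is 1/cos(latitude) and the area factor is its square. The latitude used for Cambridge is an approximate value chosen for illustration.

```python
import math


def web_mercator_area_inflation(latitude_deg: float) -> float:
    """Approximate factor by which Web Mercator exaggerates areas.

    Spherical approximation: linear scale is 1 / cos(latitude),
    so area scale is 1 / cos(latitude) squared.
    """
    return 1.0 / math.cos(math.radians(latitude_deg)) ** 2


cambridge_latitude = 42.37  # approximate latitude, an assumption for illustration
factor = web_mercator_area_inflation(cambridge_latitude)
print(f"Areas appear about {factor:.2f} times larger than on the ground")
```

At this latitude the factor comes out at roughly 1.8, which is why a local projected system measured in meters or feet is the safer basis for parcel areas.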
Import georeferenced maps
Add the georeferenced map you’d like to trace data from. In this example, we used data from Atlascope.
If you want to use an Atlascope layer
- Toggle on the layer you’d like to use and select `Bibliographic Information`.
- Copy the link after `XYZ tiles`, e.g.:
https://s3.us-east-2.wasabisys.com/urbanatlases/39999059011864/tiles/{z}/{x}/{y}.png
- Go to `Layer` > `Add Layer` > `Add XYZ Tile Connection`.
- Choose `New`, add a `Title`, and paste the URL into the dialog box. Be careful to ensure that there are no spaces at the beginning or end of the pasted URL, or it will not load.
- Make sure the layer you just created is selected in the dropdown, and choose `OK` to add it to the map.
- The layer should appear on the map. If not, make sure you are zoomed in enough.
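For readers curious about what the `{z}/{x}/{y}` placeholders in the tile link mean, here is a small Python sketch of how a map client might fill them in. It assumes the layer follows the common XYZ ("slippy map") tiling scheme, strips stray spaces from the pasted link, and uses an approximate Cambridge location and zoom level chosen only for illustration. The function names are hypothetical.

```python
import math


def tile_for(lat: float, lon: float, zoom: int):
    """Return the (x, y) tile indices containing a point at a zoom level."""
    n = 2 ** zoom  # number of tiles along each axis at this zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tile_url(template: str, lat: float, lon: float, zoom: int) -> str:
    """Fill the {z}/{x}/{y} placeholders of a tile URL template."""
    x, y = tile_for(lat, lon, zoom)
    # strip() removes leading/trailing spaces that would break the link
    return template.strip().format(z=zoom, x=x, y=y)


# The template from this guide, with stray spaces added on purpose.
template = "  https://s3.us-east-2.wasabisys.com/urbanatlases/39999059011864/tiles/{z}/{x}/{y}.png "
print(tile_url(template, lat=42.37, lon=-71.12, zoom=16))
```

Each zoom level doubles the number of tiles along each axis, which is why a layer covering a small area can appear blank until you zoom in far enough.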
If you want to use another georeferenced map
- If you have a GeoTIFF file (usually `.tif` or `.tiff`), you should be able to drag the file directly into the QGIS window and have it show up in the correct place.
- If you do not yet have a georeferenced map you’d like to work with, please refer to other guides on this prerequisite step. A useful place to start is the tutorial Adding a Historic Map to Felt, which discusses how to use the Harvard Map Collection catalog to find and georeference a map, and how to preview the georeferenced image in a web map.
Create New Shapefile Layer
- Click on `Layer` > `Create Layer` > `New Shapefile Layer`.
- Select `Polygon` as the geometry type.
- Important! From the coordinate reference system menu in this interface, ensure you are creating the new shapefile in the coordinate reference system you selected.
- Add necessary fields for the attributes you want to record (e.g., development status, owner name). These fields will become the column headers in the data table you will create. For every polygon you create, you will also fill out a value for each of these attributes.
These attributes are also what power the map symbolization. In the orange and purple map, we were able to ask the software to turn every polygon with the value of `Developed=no` orange, and every polygon with the value of `Developed=yes` purple.
- Pay attention to field types. If you are recording categories or text data, make sure you select `Text` as the field’s data type. If you are recording a number you’d like to be able to symbolize by density, make sure to select `Number` as the field type.
Draw Polygons on the Map
- Start drawing polygons by clicking on the `Toggle Editing` button, which looks like a pencil.
- Select the `Create Polygon` button.
- Start drawing!
This takes a little bit of practice to get used to. To make it easier, use the snapping tool to make sure your lines and vertices match up, and don’t leave any holes between your polygons.
- In the main QGIS menu choose `Project`, then `Snapping Options`. Toggle on the magnet icon on the far left of the wizard. Turn on `Vertex` and `Segment`. Turn on `Topological Editing` and `Snapping on Intersection` so that both buttons are engaged.
- You can right-click your in-progress data layer in the `Layer` panel and choose `View Attribute Table`. This will show you each polygon as a row in the table, and you can edit the values there.
- To avoid losing changes, `Save` often by clicking the `Pencil` or `Toggle Editing` icon.
- To learn the ins and outs of everything you can do with the editing toolbar, including moving or deleting points, check out the QGIS Editing Documentation.
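As a rough mental model of what vertex snapping does, the Python sketch below moves a clicked point onto the nearest existing vertex when one lies within a tolerance, and otherwise leaves the point where it was clicked. This is a simplified illustration, not QGIS's actual implementation; the function name, coordinates, and tolerance are all hypothetical.

```python
import math


def snap_to_vertex(point, existing_vertices, tolerance):
    """Return the nearest existing vertex within `tolerance`.

    If no vertex is close enough, the clicked point is returned unchanged.
    """
    nearest = None
    nearest_distance = tolerance
    for vertex in existing_vertices:
        distance = math.dist(point, vertex)
        if distance <= nearest_distance:
            nearest, nearest_distance = vertex, distance
    return nearest if nearest is not None else point


# Corners of a neighboring parcel that has already been traced (made up).
neighbor_corners = [(10.0, 20.0), (30.0, 20.0), (30.0, 40.0), (10.0, 40.0)]

print(snap_to_vertex((10.3, 19.8), neighbor_corners, tolerance=0.5))  # (10.0, 20.0)
print(snap_to_vertex((15.0, 15.0), neighbor_corners, tolerance=0.5))  # (15.0, 15.0)
```

Sharing exact vertices between neighboring parcels is what prevents slivers and gaps, which would otherwise distort the area totals.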
Calculate Area of Each Polygon
- Open the Attribute Table.
- Go to `Field Calculator`.
- Create a new field named `Area`.
- Use the `$area` expression to calculate the area of each polygon.
Edit Map Symbology
- Double-click the map layer to open its properties.
- Go to the `Symbology` tab.
- Change symbology from `single symbol` to `categorized`.
- Choose `developed` for Value and select distinct colors for `yes` and `no` (e.g., purple #8c5fed for `yes` and orange #cd782e for `no`, matching the development maps above).
Identify Parcels Based on Variables of Interest
- Filter the attribute table to highlight specific parcels.
- For example, to identify parcels owned by Harvard College, apply the filter `"owner_name" = 'Harvard College'`. In QGIS expressions, double quotes refer to field names and single quotes mark text values.
Export Data Table for Analysis
- Right-click on the map layer.
- Select `Export` and choose the file type (`.xlsx` or `.csv`) as needed for your analysis.