GQIS is the leading free, open source Geographic Information Systems (GIS) application. It is capable of sophisticated geodata processing and analysis, and also can be used to design publication-quality data-driven maps.
The possibilities are almost endless, but you don’t need to be a GIS expert to put it to effective use for both displaying geographic data in static maps, and for processing data for use in making interactive online maps.
Today we will learn the basics of displaying and processing data in QGIS, using this data.
Launch QGIS, and you should see a screen like this:
We will first import a shapefile based on data from the U.S. Geological Survey, detailing the risk of experiencing a damaging earthquake for the lower 48 states. The file is in the subfolder seismic_risk_clip
.
Select Layer>Add vector layer
or click on this icon:
At the dialog box, click Browse
and navigate to the file seismic_risk_clip.shp
. It is important that you select the file with the .shp
extension. Then click open, and the following map of the continental United States should appear, filled with a random color:
You can turn off the visibility of any layer by unchecking its box in the Layers
panel. This can be useful to see the status of layers that would otherwise be obscured.
These controls allow you to pan and zoom the display:
You can focus the display on the full extent of any layer by right-clicking it in the Layers
panel and selecting Zoom to layer
.
Notice EPSG:4326
at the bottom right. This defines the projection for the layer. Right-click on seismic_risk_clip
in the Layers
panel at left and select Properties>General
. You should see the following under Coordinate reference system
:
EPSG:4326
and WGS 84
are two names for the unprojected Equirectangular view we saw in week 8. Strictly speaking, it refers to a datum, not a projection. We will select a projection for our work later. Click Cancel
or OK
to close Properties
for this layer.
Now let’s import a basemap for this project. We will use two layers to do this, in the subfolders ne_50m_admin_0_countries
and ne_50m_lakes
. Import the shapefiles as before, then drag and drop them in the Layers
panel so that they appear in the following order:
Having imported all the layers, save your project by selecting Project>Save
from the top menu. Keep saving your projects at regular intervals.
Now we will style the basemap. Right-click on ne_50m_admin_0_countries
in the Layers
panel and select Properties>Style
. We are going to give the land a single color, so the default Single Symbol
option from the top dropdown menu is correct. Click on Simple fill
to call up the following options:
Click on Colors Fill
to call up the Select Color
picker. There are a range of options, including a color wheel and RGB sliders. For simplicity, we will use the Crayons
option and select Silver
:
Click OK, then repeat the process for Border
, giving it the color Snow
. Click OK to close the Properties
windows, accepting the changes:
We imported ne_50_m_lakes
to include the Great Lakes on the map, not to show smaller lakes, so in styling the lakes layer we need to filter out the smaller lakes. This means styling according to data values, so we need to look at the table of data attached to this shapefile.
Select ne_50_m_lakes
in the Layers
panel, right-click and select Open Attribute Table
, or click on this icon in the toolbar at top:
This table should open:
Notice that the table contains a field called scalerank
, in which the largest lakes — including the Great Lakes — have the value 0
. This is a categorical variable coded using numbers, like some of the variables from the traffic accidents data we wored with in week 3. Close the table once more.
Right-click on ne_50_m_lakes
in the Layers
panel and select Properties>Style
. Select Categorized
from the dropdown menu at top, and under Column
select scalerank
. Click the Classify
button, then select everything apart from the symbol with the value 0
:
Click the Delete
button to remove the other symbols and then double-click on the remaining symbol. At the next dialog box select Simple fill
and change the Fill
color to Iron
and the Border style
to No Pen
:
To make the background ocean color to match the large lakes, select Project>Project Properties>General
from the top menu, and change the Background color
to Iron
:
The map should now look like this:
Now we need to color the areas for the seismic_risk_clip
layer. Open its attribute table, and notice that it contains a field called ACC_VAL
. These numbers refer to the ground acceleration predicted for a large quake with a 2% chance of occurring in 50 years, expressed as a percentage of g, the acceleration of a body falling under gravity. The numbers go as high as 200, meaning a quake with ground acceleration of twice g. The details of the units are unimportant to most people, so we will just divide the data into five categories, from low
to high
risk of experiencing a damaging earthquake.
To do this, close the attribute table and call up Properties>Style
for the seismic_risk_clip
layer. Select Graduated
from the dropdown menu at top, which is the option to color data according to a continuous variable. Select 5
under Classes
, and then New color ramp...
under Color ramp
. While QGIS has many available color ramps, we will use this opportunity to call in a ColorBrewer color scheme. At the dialog box select ColorBrewer
and then Reds
, and then click OK
:
You will need to give the color ramp a name — the default Reds5
is fine. Select ACC_VAL
under Column
and then click the Classify
button to produce the following display:
Note that the Mode
dropdown menu gives various options for automatically setting the boundaries between the five classes or bins. However, in this case the choice does not matter because we are going to set these manually.
Double-click on the first symbol and select 19
for the Upper value
and click OK
. Then double-click on the Label
for this symbol and edit the text to Low
. Carry on editing the values and labels until the display looks like this:
(Note that the ACC_VAL
field contains only integer values, so we are not excluding any numbers between 19 and 20, and so on. Do take care when setting the values for data bins not to exclude any important data!)
Click on Change...
, then select Simple fill
and change the Border style
to No Pen
and click OK
. This will remove the black borders between the areas of color.
If you are likely to want to style data in the same format in the same way in future, it is a good idea to click the Save Style
button at bottom right and save as a QGIS Layer Style File
, which is a variant of XML. When loaded using the Load Style ...
button at bottom left, it will automatically apply the saved styling.
Now click OK
to close the Properties>Style
window to accept the changes to the style for the seismic_risk_clip
layer. The map should look like this:
Now is a good time to give the project a projection. As our map is showing areas of high seismic risk for the United States, an Albers Equal Area Conic projection is a good choice.
Select Project>Project Properties>CRS
(for Coordinate Reference System) from the top menu, and check Enable 'on the fly' CRS transformation
. This will convert any subsequent layers we import into the Albers projection, also.
Type Albers
into the Filter
box and select USA_Contiguous_Albers_Equal_Area_Conic
, which has the code EPSG:102003
:
Click OK
and the map should reproject. Notice how EPSG:102003
now appears at bottom right:
The data for this layer is in the file quakes_1964_2013_5+.csv
, and was created using the USGS earthquakes API we explored in week 5. It includes all earthquakes with a magnitude of 5 and greater from 1964 to 2013 inclusive, in a circle with a radius of 6,000 kilometers from the center of the continental United States.
To import a CSV or other delimited text file with points described by latitude and longitude coordinates, select Layer>Add Delimited Text Layer
from the top menu or click this icon:
Browse
to the file with the earthquake data, and ensure that the dialog box is filled in like this:
(If your file is not a CSV you will have to select the correct delimiter, and if your latitude and longitude fields have other names, you may have to select the X field
(longitude) and Y field
(latitude) manually.
When you click OK
you will be asked to select a projection, or CRS, for the data. You may be tempted to select the same Albers projection we have set for the project, but this will cause an error. QGIS will handle the conversion to that projection: Because this data is not yet projected, we should instead select WGS 84 EPSG:4326
:
Click OK
and a large number of points will be added to the map:
Now we will style these points, giving them a single color but sizing them according to the amount of shaking that quake each caused. Open the attribute table for the quakes layer and notice that it contains a field called mag
, for earthquake magnitude. This is a logarithmic scale, so that a magnitude difference of 1 corresponds to a 10-fold difference in earth movement, as recorded on a seismogram, as this calculator shows:
This means that the equation to convert magnitude into the amplitude of the shaking, as recorded on a seismogram, is: Amplitude = 10^Magnitude
.
Select Properties>Style
for the quakes layer, and accept Single Symbol
from the top dropdown menu, as we are not going to color the points according to values in the data.
Select Simple marker
, change the Fill
to Snow
and the Border
to Iron
.
Then select Marker
and adjust the transparency to 50%:
Now click the Advanced
button to the right and select Size scaled field
, making sure Scale area
is checked (this will ensure that the circles we are about to draw scale correctly, by area). Then select - expression -
and fill in the dialog box as follows:
This incorporates the formula above, and then divides by 200,000 — which I found through trial and error gives a reasonable display.
The final map should look like this:
We will export our finished map with a legend explaining the colors, so let’s change the name of that field so it displays nicely. Right-click on seismic_risk_clip
in the Layers
panel, select Rename
and call it Seismic risk
.
To export the map, select Project>New Print Composer
, give the composer an appropriate name and click OK
. In the print composer window, select the following options in the Composition
tab:
Now click the Add new map
icon:
Draw a rectangle over the page area, and the map should appear:
If you are not happy with the display, delete the added map and adjust display as appropriate using the pan and zoom controls:
Once you are statisfied with the appearance of your map in the print composer, click the Add legend
icon:
Draw a rectangle over the map where you want the legend to appear. Don’t worry about resizing it for now, because we are going to edit the entries that appear:
In the Item properties
tab, first delete the Title
amd then open Legend items
by clicking on the small triangle next to it:
Select and delete, using the red minus symbol, each of the items apart from Seismic risk
. Then close Legend items
by clicking on the small triangle.
Open Fonts
and change Font color...
to Snow
. Notice that the font type and sizes can be customized by clicking on the other buttons.
Uncheck Background
and resize the legend as necessary, by dragging the white squares:
Finally we will add some text to explain the circles. Click on the Add new label
icon, and draw a rectangle where you want the text to appear:
Under the Item properties
tab, type the text in the Main properties
box and again remove the background and adjust the font size as required.
Finally, the map can be exported in SVG and PDF vector formats by clicking these export icons:
(Note that the SVG export will not clip the map to the page exactly. However if you open the SVG in Inkscape and save as a PDF, which will clip to the page, you may get a better rendering of the map than trhough a direct PDF export).
The final map should look like this:
Save the QGIS project, and then select Project>New
to open a new project.
In addition to displaying geographic data, QGIS is a powerful tool for processing data for use in the software itself, or by other mapping applications. If you want to make online maps, for instance, as we will do in week 11, QGIS can help get your data into the right shape. The rest of this class introduces some of QGIS’s data-processing functionality, but there is much more to explore — see the further reading.
In your new project, import the ne_50m_admin_0_countries
shapefile, then, in the same way, use Add Vector Layer
to import the file gdp_pc_join.dbf
. This is the World Bank data on GDP per capita from week 6, in “wide” format, so that the same variable has several columns, one for each year. I opened this data in Libre Office Calc and then saved it as a dBASE (.dbf)
file — which as you’ve learned is the format for shapefile attribute tables. After import, this file will appear as an isolated table in the Layers
panel. Select it and open the attribute table to view the data:
Notice that null values have been coded as -99
; this is because the .dbf
format converts blank cells to zeros. We are going to join the data on the field ISO2C
. Open the attribute table for the countries shapefile and notice that it contains a matching field called iso_a2
.
Close the attribute table once more, right-click on the shapefile and select Properties>Joins
. Click the green plus sign and fill in the dialog box as follows to join the .dbf
file to the shapefile:
After completing the join, open the shapefile’s attribute table once more to confirm that the data from the .dbf
file has appeared.
(Another approach to joining data to a shapefile would be to perform the join in R or SQLite, like we did in the last query from week 4, and then save this data as the .dbf
file for the shapefile.)
Right-click on the joined shapefile, select Save As ...
and notice the Format
options include ESRI shapefile, GeoJSON and KML. You are also able to choose a projection (CRS) for the new shapefile and restrict its extent by latitude and longitude coordinates.
Save this file as GeoJSON with an appropriate name, keeping the default WGS 84
CRS.
When displaying geodata online, it is sometimes necessary to simplify boundary data to give a smaller file size, allowing faster loading in a web browser.
Select the joined shapefile, then select Vector>Geometry Tools>Simplify geometries
, and fill in the dialog box as follows, saving as a shapefile with a new name:
In practice, you will want to experiment with different values for Simplify tolerance
to give an acceptable trade-off between file size and appearance at high zoom levels.
Save the simplified file as GeoJSON, and compare the file size with the previously saved version.
Alternatively, you can also simplify geodata outside of QGIS using the mapshaper web app.
Start a new project, and import both the ne_50m_admin_0_countries
shapefile, and the seismic_risk
shapefile. This is original version of the file we used to make the seismic risk map, extending beyond the boundaries and coastline of the United States:
Open the attribute table for the countries shapefile, and select the United States:
Close the attribute table once more, and turn off the visibility of the seismic risk layer briefly to confirm that the United States is now highlighted.
Select Vector>Geoprocessing Tools>Clip
and fill in the dialog box as follows, making sure that Use only selected features
is checked for the Clip layer
:
Click OK
and a new shapefile will be created, clipped to the borders and coastline of the United States. This is how I made the shapefile we used to make the first map.
Sometimes you may need to draw your own shape to clip to, rather than using existing geodata. When drawing shapes based on city streets, this web app can be a useful tool. Select the Polygon
and KML
options from the dropdown menu and draw your shape over the basemap.
Paste the resulting code into a text file and save with the extension .kml
. You can then use this KML file as a clip layer in QGIS.
Look at the other options in the Vector>Geoprocessing Tools
menu. Their icons give a good idea of what they do; see here for a full explanation. (Intersect
is similar to Clip
, except that it includes data from both layers in the new attribute table.)
Now open a new project, and import the shapefile sf_test_addresses
. I made this shapefile from the addresses we geocoded in Open Refine in week 8, using the coordinates provided by the Bing API. I saved it in a Google Mercator
projection, also known as EPSG:900913
, used for Google and other online maps.
This is important, because we are going to create a buffer defining areas within 1,000 feet of the nearest point. For this we need a projection with units defined in distance, rather than degrees, which is the unit for the WGS 84
datum.
Creating a buffer shapefile is a task you might perform if, for example, working out which areas are off-limits for sex offenders under residency restrictions.
Select Vector>Geoprocessing Tools>Buffer(s)
and fill in the dialog box as follows:
Selecting the maximum value of 99
under Segments to approximate
ensures that the resulting shapes are as smooth as possible. Buffer distance
is set to 304.8
because the projection’s units are meters; this value gives the 1,000 feet we require. Checking Dissolve buffer results
merges overlapping buffers into the same polygon.
Click OK
, and this should be the result:
Like the packages that extend R’s functionality, QGIS has many plugins that perform specific tasks. We are going to install two useful plugins to complete our geodata processing tutorial.
From the top menu select Plugins>Manage and Install Plugins...
and search for MMQGIS
. Select the plugin, and click the Install plugin
button. Repeat the process for Points2One
. You should now have an MMQGIS
menu at top, and the following icon should have appeared in the left toolbar:
Open a new project, and import the shapefile storms
, made from the CSV file provided for the North Atlantic storm final project. It consists of points, which we need to turn into tracks for each storm, using the Points2One
plugin.
To do this, we need a field to uniquely identify each storm, so let’s open the attribute table and look at the data. Because names may reused in different years, this doesn’t exist. But we can create the field we need by combining name
and year
.
To do this, select the Toggle Editing
icon at top left, which looks like a pencil:
Then click the Open Field Calculator
icon from the top toolbar, which looks like an abacus:
Fill in the dialog box as follows:
You are hopefully now getting sufficiently familiar with code to realize that here we are “concatenating” the two fields into one, separating them with a hyphen. Note that you must select Text (string)
for Output field type
, because “concatenate” is a function that is valid for text strings only. Setting the Output field width
to 50
ensures that none of the values get truncated.
Click OK
, then click the Toggle Editing
icon again and Save
the change. Open the attribute table to confirm that the new field has been created.
Now click on the Points2One
icon, and fill in the dialog box as follows:
You may get a warning message. View the details if you wish: It will inform you that a line could not be created for some storms for which there is only a single point.
Accept the option to display on the map, which should now look like this:
Note that most of the entries in the attribute table for the new shapefile will not be valid, as they come from just one of the points used to make each line. If you were to use this file, I suggest opening the .dbf
file in LibreOffice Calc and deleting eveything apart from the name and year fields, to avoid confusion.
Open a new project and import the 2013_Q1
shapefile from the syria_violence
subfolder. This displays violent events in Syria’s civil war from the first quarter of 2013, as displayed on this interactive, and is in a Google Mercator
projection.
If you open the attribute table for this shapefile, you will see it contains more than 10,000 entries — so clearly many of the points are lying over the top of one another.
We will use the MMQGIS
plugin to create a hexagonal grid over the map, and then count the number of points in each grid cell, to get a better picture of the intensity of the conflict by location.
First use the zoom and pan controls to ensure that there is a little space around the points in the displayed area, giving a view something like this:
Select MMQGIS>Create>Create Grid Layer
and fill in the dialog box as follows:
Center X
and Middle Y
will by default be the longitude and latitutude for the center of the displayed area. Make sure to select Hexagon (polygon)
, and then set the H Spacing
; V Spacing
will adjust automatically. Because we are working with a shapefile in Google Mercator
projection, the units will be in meters, so here we have set a horizonatal spacing for the hexagons of 20 kilometers; if we were working with an unprojected file in the WGS 84
datum, the units would be in degrees.
Click OK
to create the grid layer, giving it the same Google Mercator
projection. Drag the new grid layer under the points in the Layers
panel and the map should look like this:
Now select Vector>Analysis Tools>Points in Polygon
and fill in the dialog box as follows:
This will create a new shapefile with a field PNTCNT
, giving the number of points in each cell in the grid. Again, give this a Google Mercator
projection. This shapefile can now be used to make maps like this:
Use the shapefile we created in class with World Bank data on GDP per capita to replicate this map, from week 1:
Send me the finished map as a PDF.
Some notes on how to approach this:
You will need to change to a World_Robinson EPSG:54030
projection. When you do so, you may find the map transforms to a strange series of geometric shapes. If this happens, right-click on the layer, select Properties>Rendering
and uncheck Simplify geometry
.
When styling the map, use a ColorBrewer sequential color scheme with 5 classes. Then use the Add class
button to add a class/bin for the countries with no data. Remember that these will have the value -99
, so you can then set the values and labels for each bin in the data manually. Use a neutral gray for the countries with no data.