Querying Web APIs

Sam Mason

Learning goals

Packages

library(httr2)
library(jsonlite) # for the prettify() function

What is a web API?

Let’s build this concept up piece by piece

An application programming interface (API) is a set of tools and protocols (in practice, a piece of code often written in Java, PHP, or Python) that allows otherwise unrelated applications to communicate

Why are APIs important to us?

APIs facilitate the transfer of data from a remote server to the data scientist

  • Well-developed APIs afford the data scientist a high degree of control over the data they receive

  • Accommodate programmatic data import

  • Allow access to data “from the source”

  • Many organizations maintain free public APIs

    • IMDB, Genius, OpenStreetMap, ZipRecruiter

How do APIs work?

Web APIs are built on HTTP (hypertext transfer protocol)

  1. The client (you) sends an HTTP request
    • Information about the data you want
  2. The API translates this request and queries the database (server)
    • In many cases, translated to SQL code
  3. The API then sends the data back to the client
    • Queried data often structured as JSON file(s)

Note: JSON format

Data on the web tends to be transferred in JSON format. JSON (JavaScript Object Notation) is a hierarchical data format, not a tabular one (such as .csv).
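
To make the contrast with tabular data concrete, here is a minimal sketch using {jsonlite}. The JSON string is made up for illustration and does not come from any real API.

```r
library(jsonlite)

# A made-up JSON record (illustrative only); note the nested "coords" object
json_txt <- '{"town":"Wenham","state":"MA","coords":{"lat":42.6,"lon":-70.9},"tags":["college","north shore"]}'

# prettify() adds indentation so the hierarchy is easier to see
prettify(json_txt)

# fromJSON() converts the text into a nested R list
parsed <- fromJSON(json_txt)
parsed$coords$lat   # reach into the hierarchy with $
parsed$tags         # a JSON array of strings becomes a character vector
```

A nested object like "coords" has no natural home in a single .csv column, which is why JSON usually needs some reshaping before analysis.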

HTTP request methods

HTTP supports several methods (functions) for interacting with servers

Method   Description
GET      Asks for a resource (data) from the server
POST     Asks to write data to the server
PUT      Asks to update data on the server
DELETE   Asks to delete data from the server
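
As a rough sketch of how these methods appear in {httr2}, the code below builds one request per method without sending anything. The URL is a hypothetical placeholder, not a real API.

```r
library(httr2)

# Hypothetical endpoint used only for illustration; nothing is sent
req <- request("https://example.com/api/items")

get_req    <- req                          # a request with no body defaults to GET
post_req   <- req |> req_method("POST")
put_req    <- req |> req_method("PUT")
delete_req <- req |> req_method("DELETE")

# Printing a request shows its method and URL without performing it
post_req
delete_req
```

In this course we will almost always use GET, since we are reading data rather than changing it.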

Our first GET request

base_url <- "https://www.gordon.edu"
request(base_url) |>  # create a GET request to Gordon’s servers to render the homepage
  req_dry_run()       # show the contents of this GET request without sending it
GET / HTTP/1.1
Host: www.gordon.edu
User-Agent: httr2/1.0.5 r-curl/5.2.3 libcurl/8.7.1
Accept: */*
Accept-Encoding: gzip
  • The first line contains a lot of important info

    • The HTTP method used (GET by default)

    • The path used to find the data on the server (not including the base URL)

    • The HTTP version used to make the request

  • All other lines are headers — think of them as metadata

    • The “Host” header reports the server’s domain name (the host portion of the base URL)
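
Headers can also be set by hand. The sketch below is one way to do that in {httr2}; the header value and the user-agent string are illustrative assumptions, not something Gordon’s server requires.

```r
library(httr2)

req <- request("https://www.gordon.edu") |>
  req_headers(Accept = "text/html") |>                       # ask for HTML specifically
  req_user_agent("my-class-project (hypothetical contact)")  # identify the client

# Printing lists the headers that would be sent; no request is performed
req
```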

HTTP query strings

base_url <- "https://www.gordon.edu"
request(base_url) |>                      # create a GET request to Gordon’s servers
  req_url_path_append("search.cfm") |>    # append “search.cfm” (an endpoint) to the base URL
  req_url_query(q = "data science") |>    # define query parameters
  req_dry_run()                           # show the contents of this GET request without sending it
GET /search.cfm?q=data%20science HTTP/1.1
Host: www.gordon.edu
User-Agent: httr2/1.0.5 r-curl/5.2.3 libcurl/8.7.1
Accept: */*
Accept-Encoding: gzip
  • "https://www.gordon.edu/search.cfm?q=data%20science"

  • search.cfm is the endpoint (think of it as a function)

  • ? marks the start of the query string (the arguments to the function)

  • q= is an argument with value data%20science

    • URLs can’t contain spaces; %20 is the URL (percent) encoding for a space
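
The pieces described above can be checked directly in R. This is a minimal sketch: URLencode() comes from base R’s utils package and url_parse() from {httr2}; neither contacts the server.

```r
library(httr2)

# Percent-encode a value the way it appears in the query string
URLencode("data science", reserved = TRUE)

# Split the full URL into its components
parts <- url_parse("https://www.gordon.edu/search.cfm?q=data%20science")
parts$hostname   # the server
parts$path       # the endpoint
parts$query      # named list of query parameters, decoded back to plain text
```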

Sending the request

Note: Gordon’s website is not a representative example

The API we’re querying allows us to communicate with a server that stores all of the text, images, videos, HTML, and CSS files used by your web browser to construct Gordon’s website. We use this as a familiar example to learn about HTTP requests, but this system is not representative of the web APIs and database servers that you’ll typically be interacting with.

base_url <- "https://www.gordon.edu"
request(base_url) |>
  req_url_path_append("search.cfm") |>
  req_url_query(q = "data science") |>
  req_perform() |>  # perform the GET request
  resp_raw()        # return the raw text/html output
HTTP/1.1 200 OK
date: Tue, 22 Oct 2024 16:32:04 GMT
content-type: text/html;charset=UTF-8
cf-ray: 8d6ae196ce003b8d-BOS
cf-cache-status: DYNAMIC
set-cookie: CFID=11427830; Expires=Wed, 23 Oct 2024 18:48:03 GMT; Path=/; HttpOnly
cf-apo-via: origin,host
content-security-policy: frame-ancestors 'self' *.gordon.edu lavidacenter.org
set-cookie: CFTOKEN=f78770d464edccb7-12B5034F-A52F-5824-894D03A9E9D98718; Expires=Wed, 23 Oct 2024 18:48:03 GMT; Path=/; HttpOnly
x-powered-by: ASP.NET
x-ua-compatible: IE=edge
vary: Accept-Encoding
speculation-rules: "/cdn-cgi/speculation"
server: cloudflare
content-encoding: gzip




<!DOCTYPE html>
<html lang="en">
<!--[if lt IE 9]>
    <html class="old-ie" lang="en">
<![endif]-->
<!--[if gte IE 9|!(IE)]>
    <html lang="en"> 
<![endif]-->
<head>
<meta name="viewport" content="width=device-width" />
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta http-equiv="Content-Language" content="EN" />



<meta name="robots" content="noindex,nofollow,noarchive">
<meta name="description" content />
<meta name="keywords" content />
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png?v=9-29">
<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png?v=9-29">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png?v=9-29">
<link rel="manifest" href="/site.webmanifest?v=9-29">
<link rel="mask-icon" href="/safari-pinned-tab.svg?v=9-29" color="#014983">
<link rel="shortcut icon" href="/favicon.ico?v=9-29">
<meta name="msapplication-TileColor" content="#014983">
<meta name="theme-color" content="#a7e7ff">
<title>Search Results for data science - Gordon College</title>

<script type="47db5d08b51a6a83efb977a3-text/javascript">(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start':
new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0],
j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src=
'https://www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f);
})(window,document,'script','dataLayer','GTM-5F93JM2');</script>

<style type="text/css" media="all">
    @import url("/_lib/css/variables.css?r=#stampfile.lastModified()#");
</style>
<style type="text/css" media="all">
    @import url("/_lib/css/responsivestyles.css?r=#stampfile.lastModified()#");
</style>

<link rel="stylesheet" type="text/css" href="//cloud.typography.com/7763712/615144/css/fonts.css" />
<link rel="stylesheet" href="https://use.typekit.net/oef5rgu.css">
<!--[if IE]>
 <style type="text/css">
  body {word-wrap: break-word;}
 </style>
<![endif]-->
<script type="47db5d08b51a6a83efb977a3-text/javascript" src="/_lib/js/page_functions.js"></script>
<script type="47db5d08b51a6a83efb977a3-text/javascript" src="/_lib/js/openwin.js"></script>
<script type="47db5d08b51a6a83efb977a3-text/javascript" src="/_lib/js/fnPreloadImages.js"></script>
<script src="//ajax.googleapis.com/ajax/libs/jquery/3.5.1/jquery.min.js" type="47db5d08b51a6a83efb977a3-text/javascript"></script>
</head>
<body>
<noscript><iframe src="https://www.googletagmanager.com/ns.html?id=GTM-5F93JM2"
height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
<header>
<div id="logo-box">
<a href="/"><img src="/images/gc-horizontal-white-yellow.svg" alt="Gordon College logo" width="160" /></a>
</div>
<div id="top-bar">
<a href="/"><img id="logo-mobile" src="/images/gc-horizontal-white-yellow.svg" alt="Gordon College Logo" /></a>
<nav role="navigation" id="slideout-navigation">
<div id="menuToggle">

<input type="checkbox" aria-label="Open the menu" />
<img class="search-nav-btn" src="/images/home2023/search-toggle.svg" alt="search icon" />

<span></span>
<span></span>
<span></span>
<ul id="slideout-menu" role="navigation">
<li>
<div class="slideout-search-container">
<form id="search" method="get" action="/search.cfm" role="search">
<input class="slideout-search-box" type="text" aria-label="Search" name="q" placeholder="Search" onclick="if (!window.__cfRLUnblockHandlers) return false; if(this.value=='SEARCH') this.value='';" onblur="if (!window.__cfRLUnblockHandlers) return false; if(this.value=='') this.value='SEARCH'" data-cf-modified-47db5d08b51a6a83efb977a3->
<input class="slideout-search-button" name="SearchButton" type="image" src="/images/search-icon-blue.svg" alt="Search" onclick="if (!window.__cfRLUnblockHandlers) return false; if (this.form.q.value == 'SEARCH') return false" data-cf-modified-47db5d08b51a6a83efb977a3->
</form>
</div>
</li>
<li style="margin:0;padding:0;">
<p class="ug-label">UNDERGRADUATE</p>
<a class="slideout-undergrad" href="/about" style="background: linear-gradient(140deg, rgba(213,240,254,1) 0%, rgba(213,240,254,1) 15%);">ABOUT</a>
<a class="slideout-undergrad" href="/admissions" style="background: linear-gradient(140deg, rgba(213,240,254,1) 15%, rgba(213,240,254,1) 30%);">ADMISSIONS &amp; AID</a>
<a class="slideout-undergrad" href="/academics" style="background: linear-gradient(140deg, rgba(213,240,254,1) 30%, rgba(211,240,246,1) 45%);">ACADEMICS</a>
<a class="slideout-undergrad" href="/studentlife" style="background: linear-gradient(140deg, rgba(211,240,246,1) 45%, rgba(209,240,240,1) 60%);">STUDENT LIFE</a>
<a class="slideout-undergrad" href="/faith" style="background: linear-gradient(140deg, rgba(209,240,240,1) 60%, rgba(207,240,232,1) 75%);">FAITH</a>
<a class="slideout-undergrad" href="https://athletics.gordon.edu/" style="background: linear-gradient(140deg, rgba(207,240,232,1) 75%, rgba(206,240,227,1) 90%);">ATHLETICS</a>
<div class="slideout-bottom-container">
<div class="slideout-container-grad">
<a class="slideout-grad" href="/gpes">GRADUATE</a>
<a class="slideout-grad" href="/giving">GIVING</a>
<a class="slideout-grad" href="/alumni">ALUMNI</a>
<a class="slideout-grad" href="/studentlinks">STUDENTS</a>
<a class="slideout-grad" href="/jobs">JOBS</a>
</div>
<div class="slideout-container-buttons">
<a class="slideout-buttons" href="/apply" style="background-color: #00aeef">APPLY</a>
<a class="slideout-buttons" href="/admissions/visit">VISIT</a>
<a class="slideout-buttons" href="/inforequest" style="background-color: #023947">GET INFO</a>
</div>
</div>
</li>
</ul>
</div>
</nav>
<div id="alt-menu-wrapper">
<a class="topnav-apply" href="/apply">APPLY TODAY</a>
<div class="search-container">
<form id="search" method="get" action="/search.cfm" style="margin:0;" role="search">
<input type="text" aria-label="Search" name="q" class="PageSearchBox" placeholder="SEARCH" onclick="if (!window.__cfRLUnblockHandlers) return false; if(this.value=='SEARCH') this.value='';" onblur="if (!window.__cfRLUnblockHandlers) return false; if(this.value=='') this.value='SEARCH'" style="text-align:center;" data-cf-modified-47db5d08b51a6a83efb977a3->
<input class="PageSearchButton" name="SearchButton" type="image" src="/images/search-icon-white.svg" alt="Search" onclick="if (!window.__cfRLUnblockHandlers) return false; if (this.form.q.value == 'SEARCH') return false" data-cf-modified-47db5d08b51a6a83efb977a3->
</form>
</div>
<ul id="alt-menu">
<li><a href="/gpes">GRADUATE</a></li>
<li><a href="/giving">GIVING</a></li>
<li style="position:relative;margin-right: 35px;"><a class="dropdown-trigger">INFO FOR</a>
<div class="info-menu-wrapper">
<ul class="info-menu">
<li><a class="info-item" href="/studentlinks">STUDENTS</a></li>
<li><a class="info-item" href="/parents">PARENTS</a></li>
<li><a class="info-item" href="/alumni">ALUMNI</a></li>
</ul>
</div>
</li>
</ul>
</div>
</div>
<div class="primary-menu-container">
<ul id="primary-menu" style="padding:0;margin: 12px 0 10px 0;">
<li><a href="/about">ABOUT</a></li>
<li><a href="/admissions">ADMISSIONS &amp; AID</a></li>
<li><a href="/academics">ACADEMICS</a></li>
<li><a href="/studentlife">STUDENT LIFE</a></li>
<li><a href="/faith">FAITH</a></li>
<li><a href="https://athletics.gordon.edu/">ATHLETICS</a></li>
</ul>
</div>
</header>
<section class="gray-section">
<div class="content-container">
<div class="body-container body-container-2col search-results">
<p class="breadcrumb">
<a href="https://www.gordon.edu/">Home</a>
&gt; <a href="#">Search Results</a>
&gt; <a href="#">data science</a>
</p>
<h1>Search Results for data science</h1>
<script type="47db5d08b51a6a83efb977a3-text/javascript" src="/_lib/js/fnOpenSearchResult.js"></script>
<div>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/datascience');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l"><b>Data Science</b> Major | Colleges for <b>Data Science</b> | Gordon - Gordon <b>...</b></span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s">Colleges with <b>data science</b> majors provide a solid foundation for tackling complex<br> <b>data</b> challenges. Learn
more about a major in <b>data science</b>. <b>...</b> <b>Data Science</b> Major. <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/datascience</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p><blockquote class="g">
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/computer_data');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">Computer &amp; <b>Data Science</b> Programs | Visit Gordon College - Gordon <b>...</b></span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s">Explore the Computer &amp; <b>Data Science</b> degree program at Gordon, New England&#39;s<br> top Christian college. Ignite
<b>...</b> Computer &amp; <b>Data Science</b>. Harness <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/computer_data</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p></blockquote>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/datascience/faculty');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l"><b>Data Science</b> Faculty - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> <b>Data Science</b> Faculty. Russ Tuck. Professor of Computer <b>...</b> facilitate the company ...<br> more ➔.
View Profile. Sam Mason. Instructor of <b>Data Science</b>. <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/datascience/faculty</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p><blockquote class="g">
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/datascience/courses');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l"><b>Data Science</b> (BS) Courses - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s">Data Science (BS) Courses. <br><font color="#008000" size="-1">www.gordon.edu/datascience/courses</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p></blockquote>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/datascience/courses/ba');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l"><b>Data Science</b> (BA) Courses - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s">Data Science (BA) Courses. <br><font color="#008000" size="-1">www.gordon.edu/datascience/courses/ba</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/academics/computerscience');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">Computer <b>Science</b> Major | Computer <b>Science</b> Near Boston - Gordon <b>...</b></span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> Russ Tuck Professor of Computer <b>Science</b> Department of Mathematical, Computer,<br> and <b>Data Science</b>
E <a href="/cdn-cgi/l/email-protection" class="__cf_email__" data-cfemail="087a7d7b7b267c7d6b63486f677a6c6766266d6c7d">[email&#160;protected]</a> P 978 867 3754. <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/academics/computerscience</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/presidentialfellows/2025');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">Presidential Fellows: 2043-2025 - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> Ethan Keyes ‘25 Hometown: Danvers, Massachusetts Major: Business Management and<br> Finance; Minors: <b>Data Science</b>
and Economics. Serving in the Office of Finance. <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/presidentialfellows/2025</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/science/faculty');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">School of <b>Science</b> &amp; Health Faculty - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> View Profile. Russ Tuck. Professor of Computer <b>Science</b> Chair, Department of<br> Mathematical, Computer,
and <b>Data</b> Sciences. <b>...</b> Sam Mason. Instructor of <b>Data Science</b>. <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/science/faculty</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/math/outcomes');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">Math Graduates&#39; Comments - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> Chris Smith, Class of 2004. In his own words: “The field of Business Intelligence<br> (BI) represents a blend of
information technology, <b>data science</b> and business <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/math/outcomes</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
<p class="g"><font size="-2"><b></b></font> <a href="#" onClick="if (!window.__cfRLUnblockHandlers) return false; fnOpenSearchResult('/academics/chemistry/faculty');" data-cf-modified-47db5d08b51a6a83efb977a3-><span class="l">Chemistry Faculty - Gordon College</span></a>
<table cellpadding="0" cellspacing="0" border="0">
<tr>
<td class="s"><b>...</b> community outreach via green chemistry. ... more ➔. View Profile. Sam Mason.<br> Instructor of <b>Data Science</b>.
With a background in ecology <b>...</b> <br><font color="#008000" size="-1">www.gordon.edu/academics/chemistry/faculty</font><font color="#008000" size="-1"> - 101k</font></td>
</tr>
</table>
</p>
</div><center>
<div class="n">
<table border="0" cellpadding="0" width="1%" cellspacing="0">
<tr align="center" valign="top">
<td valign="bottom" nowrap="1"><font size="-1">
Result Page&nbsp;</font></td>
<td nowrap="1"></td>
<td>&nbsp;<span class="i">1</span>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=10">2</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=20">3</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=30">4</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=40">5</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=50">6</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=60">7</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=70">8</a>&nbsp;
</td>
<td>&nbsp;<a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=80">9</a>&nbsp;
</td>
<td nowrap="1">&nbsp;<span class="b"><a href="/search.cfm?access=p&amp;entqr=0&amp;output=xml_no_dtd&amp;sort=date%3AD%3AL%3Ad1&amp;entsp=a&amp;ie=UTF-8&amp;client=www_frontend&amp;q=data+science&amp;num=10&amp;ud=1&amp;site=Public_Website_Collection&amp;oe=UTF-8&amp;proxystylesheet=www_frontend&amp;ip=172.27.160.198&amp;start=10">Next</a></span></td>
</tr>
</table></div>
</center>
</div>
<div class="left-column">
<div class="left-nav-container">
<ul>
<li><a href="/admissions">Admissions</a></li>
<li><a href="/about">About Gordon College</a></li>
<li><a href="/academics">Academics</a></li>
<li><a href="/studentlife">Student Life</a></li>
<li><a href="https://www.gordon.edu/athletics">Athletics</a></li>
<li><a href="/giving">Giving</a></li>
<li><a href="/gpes">Graduate, Professional and Extended Studies</a></li>
<li><a href="/faith">Faith</a></li>
<li><a href="/parents">Parent Engagement</a></li>
<li><a href="/alumni">Alumni Engagement</a></li>
<li><a href="https://www.gordon.edu/changepassword">Change Password</a></li>
<li><a href="/enc">A Message to ENC Students and Families from President Hammond</a></li>
<li class="homelink"><a href="/"><img src="/images/homeicon.svg" alt="Home icon" />Home</a></li>
</ul>
</div>
<div class="connect-area">

<div class="fixed-admissions-links-wrapper">
<a class="fixed-admissions-link" data-featherlight="#inquiryform" href="#">INFO</a>
<a class="fixed-admissions-link" href="/admissions/visit">VISIT</a>
<a class="fixed-admissions-link" href="/apply">APPLY</a>
</div>
<h3>Connect with us</h3>
<ul class="social-list">
<li><a href="https://www.facebook.com/GordonCollege"><img src="/images/layout/facebook25.png" alt="Facebook" /></a></li>
<li><a href="https://twitter.com/gordoncollege"><img src="/images/layout/twitter25.png" alt="Twitter" /></a></li>
<li><a href="https://instagram.com/gordoncollege"><img src="/images/layout/instagram25.png" alt="Instagram" /></a></li>
<li><a href="https://www.youtube.com/user/GordonCollege"><img src="/images/layout/youtube25.png" alt="YouTube" /></a></li>
</ul>
</div>
</div>
</div>
</section>
<section class="darkgray-section footer">
<div class="content-container">
<p><a href="/"><img class="footer-shield-icon" src="/images/gc-icon-white-yellow.svg" style="width: 60px;" alt="Gordon College logo" /></a></p>
<p><strong>Gordon College, 255 Grapevine Road, Wenham, MA 01984</strong><br/>
978 867 4000 &nbsp;&nbsp;|&nbsp;&nbsp; <a href="/cdn-cgi/l/email-protection#c8a9aca5a1bbbba1a7a6bb88afa7baaca7a6e6adacbd"><span class="__cf_email__" data-cfemail="cfaeaba2a6bcbca6a0a1bc8fa8a0bdaba0a1e1aaabba">[email&#160;protected]</span></a>&nbsp;&nbsp; |&nbsp;&nbsp; <a href="/cdn-cgi/l/email-protection#3b52555d547b5c54495f5455155e5f4e"><span class="__cf_email__" data-cfemail="5d34333b321d3a322f39323373383928">[email&#160;protected]</span></a></p>
<div>
<a href="https://instagram.com/gordoncollege"><img src="/images/social/Instagram_Glyph_Gradient.png" alt="Instagram icon" height="30" width="30" style="margin-right: 10px;" loading="lazy"></a>
<a href="https://www.tiktok.com/@gordoncollege"><img src="/images/social/tiktok-color.png" alt="Tiktok icon" height="30" width="30" style="margin-right: 10px;" loading="lazy"></a>
<a href="https://www.facebook.com/GordonCollege"><img src="/images/social/facebook-circle.png" alt="Facebook icon" height="30" width="30" style="margin-right: 10px;" loading="lazy"></a>
<a href="https://www.youtube.com/user/GordonCollege"><img src="/images/social/youtube-square.png" alt="YouTube icon" height="30" width="30" style="margin-right: 10px;" loading="lazy"></a>
<a href="https://twitter.com/gordoncollege"><img src="/images/social/x.png" alt="X icon" height="30" width="30" style="margin-right:0;" loading="lazy"></a>
</div>
<img class="img-right no-mobile" style="width:17%;" src="/images/seal70.png" alt="Gordon College institutional seal" loading="lazy" />
<ul>
<li><a style="color:var(--snowday);" href="/apply">Apply Today</a></li>
<li><a style="color:var(--snowday);" href="/visit">Visit Campus</a></li>
<li><a style="color:var(--snowday);" href="/inforequest">Request Information</a></li>
<li><hr></li>
<li><a href="/contact">Contact</a></li>
<li><a href="/directions">Directions/Map</a></li>
<li><a href="https://stories.gordon.edu">The Bell: News &amp; Stories</a></li>
<li><a href="/jobs">Jobs</a></li>
<li><a href="/about">About</a></li>
<li><a href="/admissions">Admissions &amp; Aid</a></li>
<li><a href="/academics">Academics</a></li>
<li><a href="/studentlife">Student Life</a><br/></li>
<li><a href="/faith">Faith</a></li>
<li><a href="https://athletics.gordon.edu/">Athletics</a></li>
<li><a href="/gpes">Graduate Programs</a></li>
<li><a href="/giving">Giving</a></li>
<li><a href="/alumni">Alumni</a></li>
<li><a href="/parents">Parents</a></li>
<li><a href="/studentlinks">Student Links</a></li>
<li><a href="/hr">Human Resources</a></li>
<li><a href="/bennett">Bennett Center</a></li>
<li><a href="https://lavidacenter.org">La Vida</a></li>
<li><a href="javascript:nw=window.open('/webprivacy.cfm','Privacy','resizable,width=640,height=800,scrollbars=yes');nw.focus();">Website Privacy</a></li>
<li><a href="/titleix">Title IX</a></li>
</ul>
<p class="boilerplate"><em>Gordon College is New England&rsquo;s top Christian college, located on the North Shore of Boston in Wenham, MA.</em></p>
<p class="boilerplate">&copy; Copyright 2024. All rights reserved.</p>
<p style="opacity:0.02;">866-665-1780</p>
<img class="mobile-only" style="width:60%;" src="/images/seal70.png" alt="Gordon College institutional seal" loading="lazy" />
</div>
</section>

<div id="inquiryform" class="lightbox" style="color: #31342b;">
<div id="form_281d17a5-1050-4255-b12b-17693f078913">Loading...</div><script data-cfasync="false" src="/cdn-cgi/scripts/5c5dd728/cloudflare-static/email-decode.min.js"></script><script type="47db5d08b51a6a83efb977a3-text/javascript">/*<![CDATA[*/var script = document.createElement('script'); script.async = 1; script.src = 'https://apply.gordon.edu/register/?id=281d17a5-1050-4255-b12b-17693f078913&output=embed&div=form_281d17a5-1050-4255-b12b-17693f078913' + ((location.search.length > 1) ? '&' + location.search.substring(1) : ''); var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(script, s);/*]]>*/</script>
</div>
<script src="/cdn-cgi/scripts/7d0fa10a/cloudflare-static/rocket-loader.min.js" data-cf-settings="47db5d08b51a6a83efb977a3-|49" defer></script></body>
</html>
</body>
</html>

HTTP response codes

The server (database) always includes a numeric status code in its response to your request

Code   Meaning
1xx    Informational: your request is being processed
2xx    Success: transfer of information complete
3xx    Redirection: further action needed
4xx    Client error: you messed up somehow
5xx    Server error: I (the server) messed up somehow
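
The sketch below shows one way to inspect status codes with {httr2}. The responses are built by hand with response() so that no network call is needed, and status_class() is a hypothetical helper written for this example, not part of any package.

```r
library(httr2)

# Hypothetical helper: map a status code to its class using integer division
status_class <- function(code) {
  switch(as.character(code %/% 100),
    "1" = "informational",
    "2" = "success",
    "3" = "redirection",
    "4" = "client error",
    "5" = "server error",
    "unknown"
  )
}

# Hand-built responses stand in for real ones
ok_resp      <- response(status_code = 200)
missing_resp <- response(status_code = 404)

resp_status(ok_resp)                      # the numeric status code
resp_is_error(missing_resp)               # TRUE for 4xx and 5xx codes
status_class(resp_status(missing_resp))   # which family the code belongs to
```

By default, req_perform() turns 4xx and 5xx responses into R errors, so a failed request usually stops your code rather than passing along a bad response silently.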

The AccuWeather API

Motivating prompt

I want to know what the weather is in Wenham, MA.

  • AccuWeather exposes several different APIs that might be helpful

  • Each API will often support several endpoints (e.g., search functions or terminal directories)

  • API documentation organization varies by developer, but endpoint pages tend to include (at a minimum) the following components

    • The URL for the endpoint

    • The query parameters (arguments)

    • The response format and parameters (the data you get)

Postal code search endpoint

  • Query string parameters

    • apikey: a “password” that authorizes my use of the API

    • q: the postal code I want information on

base_url <- "http://dataservice.accuweather.com/locations/v1"
request(base_url) |>
  req_url_path_append("postalcodes/search") |> # getting to the endpoint
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV",
                q = 01984) |>
  req_perform() -> my_loc_resp

Tip: Building {httr2} requests

I can perform the request above with only two lines of code:

request('http://dataservice.accuweather.com/locations/v1/postalcodes/search?apikey=jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV&q=01984') |>
  req_perform()

This approach suffers from two major limitations: (1) it is not particularly readable, and (2) it does not lend itself nicely to programmatic tools like user-defined functions and iteration. The first limitation is self-evident. Let’s substantiate the second with an example.

Say I’d like to wrap this code in a function that allows the user to specify any arbitrary postal code. Then I can use that function within map() to iterate through a whole vector of postal codes. The code above, as written, makes this difficult to accomplish because the postal code information is not independent of the rest of the URL. The {httr2} package gives us the ability to decompose GET request URLs into their components such that each piece can be manipulated independently.

library(purrr) # for map()

# defining the function
query_locations <- function(postal_code) {
  request("http://dataservice.accuweather.com/locations/v1") |>
    req_url_path_append("postalcodes/search") |>
    req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV",
                  q = postal_code) |>
    req_perform()
}

# applying to multiple postal codes
# (quoted so that leading zeros such as "01974" are preserved)
map(.x = c("01974", "04038", "30126", "11590", "33414"),
    .f = query_locations)
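
A detail worth knowing when passing postal codes: R reads an unquoted 01984 as the number 1984, so the leading zero never reaches the query string. The sketch below illustrates this with a placeholder host (example.com is an assumption, and nothing is sent).

```r
library(httr2)

# A numeric literal with a leading zero is just the number 1984
identical(01984, 1984)

# Build two requests that differ only in how the postal code is written
req_num <- request("https://example.com") |> req_url_query(q = 01984)
req_chr <- request("https://example.com") |> req_url_query(q = "01984")

req_num$url   # the query string carries 1984
req_chr$url   # the query string carries 01984
```

Passing postal codes as character strings avoids this, which matters for New England ZIP codes in particular.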

Checking the location response

Let’s look at the HTTP response that the server sent back

my_loc_resp |> resp_raw()
HTTP/1.1 200 OK
Date: Tue, 22 Oct 2024 16:32:04 GMT
Content-Type: application/json; charset=utf-8
Transfer-Encoding: chunked
Connection: keep-alive
Content-Encoding: gzip
Vary: Accept-Encoding
Request-Context: appId=cid-v1:cc223195-bccf-4201-9cde-374567896e20
RateLimit-Limit: 50
RateLimit-Remaining: 49
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
Access-Control-Allow-Methods: GET, PUT, POST, DELETE
Access-Control-Allow-Headers: origin, x-requested-with, accept
Access-Control-Max-Age: 3628800
Cache-Control: public, max-age=86359
Expires: Wed, 23 Oct 2024 16:31:23 GMT
Akamai-GRN: 0.55793517.1729614724.6f59b0b
Accept: */*
Accept-Encoding: gzip
Host: api.accuweather.com
User-Agent: httr2/1.0.5 r-curl/5.2.3 libcurl/8.7.1
X-Forwarded-For: 50.227.112.186
X-Forwarded-Port: 80
X-Forwarded-Proto: http

[{"Version":1,"Key":"504690_PC","Type":"PostalCode","Rank":500,"LocalizedName":"Domselaar","EnglishName":"Domselaar","PrimaryPostalCode":"1984","Region":{"ID":"SAM","LocalizedName":"South America","EnglishName":"South America"},"Country":{"ID":"AR","LocalizedName":"Argentina","EnglishName":"Argentina"},"AdministrativeArea":{"ID":"B","LocalizedName":"Buenos Aires","EnglishName":"Buenos Aires","Level":1,"LocalizedType":"Province","EnglishType":"Province","CountryID":"AR"},"TimeZone":{"Code":"ART","Name":"America/Argentina/Buenos_Aires","GmtOffset":-3.0,"IsDaylightSaving":false,"NextOffsetChange":null},"GeoPosition":{"Latitude":-35.067,"Longitude":-58.3,"Elevation":{"Metric":{"Value":20.0,"Unit":"m","UnitType":5},"Imperial":{"Value":66.0,"Unit":"ft","UnitType":0}}},"IsAlias":false,"SupplementalAdminAreas":[],"DataSets":["AirQualityCurrentConditions","AirQualityForecasts","FutureRadar","MinuteCast","Radar"]},{"Version":1,"Key":"504691_PC","Type":"PostalCode","Rank":500,"LocalizedName":"Haras Rataplan","EnglishName":"Haras Rataplan","PrimaryPostalCode":"1984","Region":{"ID":"SAM","LocalizedName":"South America","EnglishName":"South America"},"Country":{"ID":"AR","LocalizedName":"Argentina","EnglishName":"Argentina"},"AdministrativeArea":{"ID":"B","LocalizedName":"Buenos Aires","EnglishName":"Buenos Aires","Level":1,"LocalizedType":"Province","EnglishType":"Province","CountryID":"AR"},"TimeZone":{"Code":"ART","Name":"America/Argentina/Buenos_Aires","GmtOffset":-3.0,"IsDaylightSaving":false,"NextOffsetChange":null},"GeoPosition":{"Latitude":-35.067,"Longitude":-58.3,"Elevation":{"Metric":{"Value":20.0,"Unit":"m","UnitType":5},"Imperial":{"Value":66.0,"Unit":"ft","UnitType":0}}},"IsAlias":false,"SupplementalAdminAreas":[],"DataSets":["AirQualityCurrentConditions","AirQualityForecasts","FutureRadar","MinuteCast","Radar"]},{"Version":1,"Key":"375860_PC","Type":"PostalCode","Rank":500,"LocalizedName":"La Tour VS","EnglishName":"La Tour 
VS","PrimaryPostalCode":"1984","Region":{"ID":"EUR","LocalizedName":"Europe","EnglishName":"Europe"},"Country":{"ID":"CH","LocalizedName":"Switzerland","EnglishName":"Switzerland"},"AdministrativeArea":{"ID":"VS","LocalizedName":"Valais","EnglishName":"Valais","Level":1,"LocalizedType":"Canton","EnglishType":"Canton","CountryID":"CH"},"TimeZone":{"Code":"CEST","Name":"Europe/Zurich","GmtOffset":2.0,"IsDaylightSaving":true,"NextOffsetChange":"2024-10-27T01:00:00Z"},"GeoPosition":{"Latitude":46.083,"Longitude":7.517,"Elevation":{"Metric":{"Value":1565.0,"Unit":"m","UnitType":5},"Imperial":{"Value":5136.0,"Unit":"ft","UnitType":0}}},"IsAlias":false,"SupplementalAdminAreas":[],"DataSets":["AirQualityCurrentConditions","AirQualityForecasts","Alerts","DailyPollenForecast","ForecastConfidence","FutureRadar","MinuteCast","Radar"]},{"Version":1,"Key":"375861_PC","Type":"PostalCode","Rank":500,"LocalizedName":"Les Hauderes","EnglishName":"Les Hauderes","PrimaryPostalCode":"1984","Region":{"ID":"EUR","LocalizedName":"Europe","EnglishName":"Europe"},"Country":{"ID":"CH","LocalizedName":"Switzerland","EnglishName":"Switzerland"},"AdministrativeArea":{"ID":"VS","LocalizedName":"Valais","EnglishName":"Valais","Level":1,"LocalizedType":"Canton","EnglishType":"Canton","CountryID":"CH"},"TimeZone":{"Code":"CEST","Name":"Europe/Zurich","GmtOffset":2.0,"IsDaylightSaving":true,"NextOffsetChange":"2024-10-27T01:00:00Z"},"GeoPosition":{"Latitude":46.079,"Longitude":7.476,"Elevation":{"Metric":{"Value":1776.0,"Unit":"m","UnitType":5},"Imperial":{"Value":5828.0,"Unit":"ft","UnitType":0}}},"IsAlias":false,"SupplementalAdminAreas":[],"DataSets":["AirQualityCurrentConditions","AirQualityForecasts","Alerts","DailyPollenForecast","ForecastConfidence","FutureRadar","MinuteCast","Radar"]}]
  • The resp_raw() function recreates the raw server response

  • The data we want is stored in the body of the response (after all the headers) using the JSON format

my_loc_resp |>
  resp_body_string() |>
  prettify()
1
Extract the body of the response as a single string of JSON text
2
Format the string using proper indentation
[
    {
        "Version": 1,
        "Key": "504690_PC",
        "Type": "PostalCode",
        "Rank": 500,
        "LocalizedName": "Domselaar",
        "EnglishName": "Domselaar",
        "PrimaryPostalCode": "1984",
        "Region": {
            "ID": "SAM",
            "LocalizedName": "South America",
            "EnglishName": "South America"
        },
        "Country": {
            "ID": "AR",
            "LocalizedName": "Argentina",
            "EnglishName": "Argentina"
        },
        "AdministrativeArea": {
            "ID": "B",
            "LocalizedName": "Buenos Aires",
            "EnglishName": "Buenos Aires",
            "Level": 1,
            "LocalizedType": "Province",
            "EnglishType": "Province",
            "CountryID": "AR"
        },
        "TimeZone": {
            "Code": "ART",
            "Name": "America/Argentina/Buenos_Aires",
            "GmtOffset": -3.0,
            "IsDaylightSaving": false,
            "NextOffsetChange": null
        },
        "GeoPosition": {
            "Latitude": -35.067,
            "Longitude": -58.3,
            "Elevation": {
                "Metric": {
                    "Value": 20.0,
                    "Unit": "m",
                    "UnitType": 5
                },
                "Imperial": {
                    "Value": 66.0,
                    "Unit": "ft",
                    "UnitType": 0
                }
            }
        },
        "IsAlias": false,
        "SupplementalAdminAreas": [

        ],
        "DataSets": [
            "AirQualityCurrentConditions",
            "AirQualityForecasts",
            "FutureRadar",
            "MinuteCast",
            "Radar"
        ]
    },
    {
        "Version": 1,
        "Key": "504691_PC",
        "Type": "PostalCode",
        "Rank": 500,
        "LocalizedName": "Haras Rataplan",
        "EnglishName": "Haras Rataplan",
        "PrimaryPostalCode": "1984",
        "Region": {
            "ID": "SAM",
            "LocalizedName": "South America",
            "EnglishName": "South America"
        },
        "Country": {
            "ID": "AR",
            "LocalizedName": "Argentina",
            "EnglishName": "Argentina"
        },
        "AdministrativeArea": {
            "ID": "B",
            "LocalizedName": "Buenos Aires",
            "EnglishName": "Buenos Aires",
            "Level": 1,
            "LocalizedType": "Province",
            "EnglishType": "Province",
            "CountryID": "AR"
        },
        "TimeZone": {
            "Code": "ART",
            "Name": "America/Argentina/Buenos_Aires",
            "GmtOffset": -3.0,
            "IsDaylightSaving": false,
            "NextOffsetChange": null
        },
        "GeoPosition": {
            "Latitude": -35.067,
            "Longitude": -58.3,
            "Elevation": {
                "Metric": {
                    "Value": 20.0,
                    "Unit": "m",
                    "UnitType": 5
                },
                "Imperial": {
                    "Value": 66.0,
                    "Unit": "ft",
                    "UnitType": 0
                }
            }
        },
        "IsAlias": false,
        "SupplementalAdminAreas": [

        ],
        "DataSets": [
            "AirQualityCurrentConditions",
            "AirQualityForecasts",
            "FutureRadar",
            "MinuteCast",
            "Radar"
        ]
    },
    {
        "Version": 1,
        "Key": "375860_PC",
        "Type": "PostalCode",
        "Rank": 500,
        "LocalizedName": "La Tour VS",
        "EnglishName": "La Tour VS",
        "PrimaryPostalCode": "1984",
        "Region": {
            "ID": "EUR",
            "LocalizedName": "Europe",
            "EnglishName": "Europe"
        },
        "Country": {
            "ID": "CH",
            "LocalizedName": "Switzerland",
            "EnglishName": "Switzerland"
        },
        "AdministrativeArea": {
            "ID": "VS",
            "LocalizedName": "Valais",
            "EnglishName": "Valais",
            "Level": 1,
            "LocalizedType": "Canton",
            "EnglishType": "Canton",
            "CountryID": "CH"
        },
        "TimeZone": {
            "Code": "CEST",
            "Name": "Europe/Zurich",
            "GmtOffset": 2.0,
            "IsDaylightSaving": true,
            "NextOffsetChange": "2024-10-27T01:00:00Z"
        },
        "GeoPosition": {
            "Latitude": 46.083,
            "Longitude": 7.517,
            "Elevation": {
                "Metric": {
                    "Value": 1565.0,
                    "Unit": "m",
                    "UnitType": 5
                },
                "Imperial": {
                    "Value": 5136.0,
                    "Unit": "ft",
                    "UnitType": 0
                }
            }
        },
        "IsAlias": false,
        "SupplementalAdminAreas": [

        ],
        "DataSets": [
            "AirQualityCurrentConditions",
            "AirQualityForecasts",
            "Alerts",
            "DailyPollenForecast",
            "ForecastConfidence",
            "FutureRadar",
            "MinuteCast",
            "Radar"
        ]
    },
    {
        "Version": 1,
        "Key": "375861_PC",
        "Type": "PostalCode",
        "Rank": 500,
        "LocalizedName": "Les Hauderes",
        "EnglishName": "Les Hauderes",
        "PrimaryPostalCode": "1984",
        "Region": {
            "ID": "EUR",
            "LocalizedName": "Europe",
            "EnglishName": "Europe"
        },
        "Country": {
            "ID": "CH",
            "LocalizedName": "Switzerland",
            "EnglishName": "Switzerland"
        },
        "AdministrativeArea": {
            "ID": "VS",
            "LocalizedName": "Valais",
            "EnglishName": "Valais",
            "Level": 1,
            "LocalizedType": "Canton",
            "EnglishType": "Canton",
            "CountryID": "CH"
        },
        "TimeZone": {
            "Code": "CEST",
            "Name": "Europe/Zurich",
            "GmtOffset": 2.0,
            "IsDaylightSaving": true,
            "NextOffsetChange": "2024-10-27T01:00:00Z"
        },
        "GeoPosition": {
            "Latitude": 46.079,
            "Longitude": 7.476,
            "Elevation": {
                "Metric": {
                    "Value": 1776.0,
                    "Unit": "m",
                    "UnitType": 5
                },
                "Imperial": {
                    "Value": 5828.0,
                    "Unit": "ft",
                    "UnitType": 0
                }
            }
        },
        "IsAlias": false,
        "SupplementalAdminAreas": [

        ],
        "DataSets": [
            "AirQualityCurrentConditions",
            "AirQualityForecasts",
            "Alerts",
            "DailyPollenForecast",
            "ForecastConfidence",
            "FutureRadar",
            "MinuteCast",
            "Radar"
        ]
    }
]
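Besides the body, individual pieces of the response (the status code and specific headers) can be pulled out with {httr2} helper functions. The sketch below builds a small stand-in response by hand with response(), so nothing is sent to the server; the object name fake_resp and the trimmed body are assumptions for illustration, while the header values are copied from the raw response shown earlier.

```r
library(httr2)

# a hand-built stand-in for a real response object (no request is sent)
fake_resp <- response(
  status_code = 200,
  headers = list(
    `Content-Type` = "application/json; charset=utf-8",
    `RateLimit-Limit` = "50",
    `RateLimit-Remaining` = "49"
  ),
  body = charToRaw('[{"Key": "504690_PC"}]')
)

resp_status(fake_resp)                         # the HTTP status code
resp_content_type(fake_resp)                   # the media type of the body
resp_header(fake_resp, "RateLimit-Remaining")  # a single header, by name
resp_body_string(fake_resp)                    # the body as a string
```

The same functions should work on a real response such as my_loc_resp.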
 

The JSON data format

  • A hierarchical, object-based data format

  • Conceptually similar to list objects in R

Four “scalar” values

  • Strings (")

  • Numbers

  • Booleans (true, false)

  • null

Two “vector-like” data types

  • Objects
    • Objects created with {}
  • Arrays
    • Arrays created with []
  • Object elements are defined as "key": value pairs (like Python dictionaries)

  • Arrays are unnamed collections of elements

simple_json <- '[{"name": "Bob Smith","age": 38,"female": false,"siblings": ["Larry", "Moe", "Curly"]},{"name": "Jane Doe","age": null,"female": true,"siblings": ["Thelma", "Louise"]}]'
prettify(simple_json) # to render the JSON data nicely
[
    {
        "name": "Bob Smith",
        "age": 38,
        "female": false,
        "siblings": [
            "Larry",
            "Moe",
            "Curly"
        ]
    },
    {
        "name": "Jane Doe",
        "age": null,
        "female": true,
        "siblings": [
            "Thelma",
            "Louise"
        ]
    }
]
 
  • A two element array ([])

  • Each element is an object ({}) representing a person

  • Each element contains an array of string values

  • As an R list, this data would look something like

simple_json_list <- list(
  list(
    name = "Bob Smith",
    age = 38,
    female = FALSE,
    siblings = list("Larry", "Moe", "Curly")
  ),
  list(
    name = "Jane Doe",
    age = NA,
    female = TRUE,
    siblings = list("Thelma", "Louise")
  )
)
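Rather than writing the list by hand, the JSON string can be parsed directly. The following is a minimal sketch using fromJSON() from {jsonlite} with simplifyVector = FALSE, which keeps every array and object as an R list; the object name parsed is an assumption for illustration. With these settings a JSON null comes back as NULL rather than the NA used in the hand-written list.

```r
library(jsonlite)

simple_json <- '[{"name": "Bob Smith","age": 38,"female": false,"siblings": ["Larry", "Moe", "Curly"]},{"name": "Jane Doe","age": null,"female": true,"siblings": ["Thelma", "Louise"]}]'

# simplifyVector = FALSE keeps every array and object as an R list
parsed <- fromJSON(simple_json, simplifyVector = FALSE)

length(parsed)                  # number of people in the outer array
parsed[[1]][["name"]]           # a value from the first object
parsed[[2]][["siblings"]][[1]]  # first element of a nested array
is.null(parsed[[2]][["age"]])   # checks whether the JSON null became NULL
```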

Note: JSON objects vs. arrays

In the wild, JSON objects tend to contain heterogeneous data, and JSON arrays tend to contain homogeneous data. To illustrate this idea, let’s look at a couple of the objects returned by the AccuWeather API:

"TimeZone": {
    "Code": "EDT",
    "Name": "America/New_York",
    "GmtOffset": -4.0,
    "IsDaylightSaving": true,
    "NextOffsetChange": "2024-11-03T06:00:00Z"
}

This object is itself a "key": value pair in a larger object. Each element of this object describes a different attribute of the time zone. Compare this to the array associated with the "DataSets" key found at the end of the JSON data:

"DataSets": [
    "AirQualityCurrentConditions",
    "AirQualityForecasts",
    "Alerts",
    "DailyAirQualityForecast",
    "DailyPollenForecast",
    "ForecastConfidence",
    "FutureRadar",
    "MinuteCast",
    "ProximityNotification-Lightning",
    "Radar",
    "TidalForecast"
]

Here each element of the array is a string, and all elements are fundamentally the same thing: the name of an AccuWeather dataset within which there exists data associated with this postal code.
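This distinction matters once the data is in R. As a rough sketch (the trimmed JSON snippet and the object names are assumptions for illustration), a homogeneous array can be flattened into an atomic vector, while a heterogeneous object is better kept as a named list so that each value keeps its own type.

```r
library(jsonlite)

# a trimmed, hand-written snippet in the shape of the AccuWeather data
snippet <- '{
  "TimeZone": {"Code": "EDT", "GmtOffset": -4.0, "IsDaylightSaving": true},
  "DataSets": ["Alerts", "MinuteCast", "Radar"]
}'
loc <- fromJSON(snippet, simplifyVector = FALSE)

# homogeneous array: every element is a string, so flatten to a character vector
datasets <- unlist(loc[["DataSets"]])
datasets

# heterogeneous object: keep as a named list and inspect each element's type
time_zone <- loc[["TimeZone"]]
sapply(time_zone, class)
```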

Current conditions endpoint

Motivating prompt

I want to know what the weather is in Wenham, MA.

  • From the JSON data returned by our first request, the location key is "685_PC"

  • We can now use the current conditions API to get the weather for our location

Note: URL vs. query string

Web developers and data engineers have a lot of freedom in how they design their APIs. You might have expected that the location key would be passed to the current conditions endpoint as a query parameter. Instead, the location key is provided as part of the URL path, before the query string.

  • Query string parameters

    • apikey: a “password” that authorizes my use of the API
base_url <- "http://dataservice.accuweather.com/currentconditions/v1"
request(base_url) |>
  req_url_path_append("685_PC") |>
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV") |>
  req_perform() -> cond_resp
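To check which parts of a request end up in the path and which end up in the query string, the request's URL can be taken apart with url_parse() from {httr2}. This is a minimal sketch that only builds the request and never sends it; "YOUR_API_KEY" is a placeholder and the object names are assumptions.

```r
library(httr2)

# build the current conditions request without performing it
cond_req <- request("http://dataservice.accuweather.com/currentconditions/v1") |>
  req_url_path_append("685_PC") |>
  req_url_query(apikey = "YOUR_API_KEY")

# split the URL into its components
cond_url <- url_parse(cond_req$url)
cond_url$path   # the location key is part of the path
cond_url$query  # the API key is a query parameter
```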

Pulling values out of JSON

  • Raw JSON is difficult to work with in R

  • We’ll convert to a list, and then use indexing to get the information we want

cond_resp |>
  resp_body_json() ->
  cond_list

str(cond_list)
List of 1
 $ :List of 10
  ..$ LocalObservationDateTime: chr "2024-10-22T12:22:00-04:00"
  ..$ EpochTime               : int 1729614120
  ..$ WeatherText             : chr "Sunny"
  ..$ WeatherIcon             : int 1
  ..$ HasPrecipitation        : logi FALSE
  ..$ PrecipitationType       : NULL
  ..$ IsDayTime               : logi TRUE
  ..$ Temperature             :List of 2
  .. ..$ Metric  :List of 3
  .. .. ..$ Value   : num 23
  .. .. ..$ Unit    : chr "C"
  .. .. ..$ UnitType: int 17
  .. ..$ Imperial:List of 3
  .. .. ..$ Value   : num 73
  .. .. ..$ Unit    : chr "F"
  .. .. ..$ UnitType: int 18
  ..$ MobileLink              : chr "http://www.accuweather.com/en/us/wenham-ma/01984/current-weather/685_pc?lang=en-us"
  ..$ Link                    : chr "http://www.accuweather.com/en/us/wenham-ma/01984/current-weather/685_pc?lang=en-us"
  • What is the current temperature in degrees F?
cond_list[[1]][["Temperature"]][["Imperial"]][["Value"]]
[1] 73
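Chains of [[ ]] get long quickly. One alternative, sketched here under the assumption that {purrr} is available, is pluck(), which takes the same sequence of positions and names and can also return a default when a value is missing. The list cond_list_demo is a trimmed, hand-built stand-in for cond_list, and the "Wind" element is a made-up example of a missing key.

```r
library(purrr)

# a trimmed, hand-built stand-in for the parsed conditions response
cond_list_demo <- list(
  list(
    WeatherText = "Sunny",
    HasPrecipitation = FALSE,
    Temperature = list(
      Metric = list(Value = 23, Unit = "C"),
      Imperial = list(Value = 73, Unit = "F")
    )
  )
)

# same path as the [[ ]] chain, written as one call
pluck(cond_list_demo, 1, "Temperature", "Imperial", "Value")

# a path that does not exist falls back to the supplied default
pluck(cond_list_demo, 1, "Wind", "Speed", .default = NA)
```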

In-class exercises

Use my API key to authorize all GET requests to the AccuWeather API. API requests using {httr2} are generally built using the following template:

# creating the request object (similar to how plots start with ggplot())
request() |>
  # appending to the base URL (if needed) to get to the desired endpoint
  req_url_path_append() |>
  # defining query parameters
  req_url_query() |>
  # sending the GET request to the API
  req_perform() |> # the output here is a response object
  # converting the body of the response from JSON to an R list
  resp_body_json()
  1. What is the current temperature in your hometown? If your hometown is in another country, use the zip code for my hometown (04021).

  2. Is it raining in Portland, Oregon (97229) right now?

# exercise 1
# making the request to the location endpoint and receiving the response
base_url <- "http://dataservice.accuweather.com"
request(base_url) |>
  req_url_path_append("locations/v1/postalcodes/search") |>
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV",
                q = "04021") |> # a string, so the leading zero is kept
  req_perform() |>
  resp_body_json() -> home_loc_list

# pulling the location key out of the list
home_loc_key <- home_loc_list[[1]][["Key"]]

# making the request to the conditions endpoint
request(base_url) |>
  req_url_path_append("currentconditions/v1") |>
  req_url_path_append(home_loc_key) |>
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV") |>
  req_perform() |>
  resp_body_json() ->
  home_cond_list

# pulling out the temperature value in degrees F
home_cond_list[[1]][["Temperature"]][["Imperial"]][["Value"]]

# exercise 2
request(base_url) |>
  req_url_path_append("locations/v1/postalcodes/search") |>
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV",
                q = "97229") |>
  req_perform() |>
  resp_body_json() ->
  po_loc_list

po_loc_key <- po_loc_list[[1]][["Key"]]

request(base_url) |>
  req_url_path_append("currentconditions/v1") |>
  req_url_path_append(po_loc_key) |>
  req_url_query(apikey = "jEc2ZqG1IE1CfApGG4gwrBY1WmpaUWgV") |>
  req_perform() |>
  resp_body_json() ->
  po_cond_list

po_cond_list[[1]][["HasPrecipitation"]]
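Taking the first element of the location list assumes that the first match is the one we want. The location response shown earlier contained matches from more than one country (Argentina and Switzerland), so a slightly more careful approach is to select the key by country. The helper location_key_for() below is a hypothetical sketch, and demo_loc_list is a trimmed, hand-built stand-in for a parsed location response.

```r
# hypothetical helper: return the Key of the first result in a given country
location_key_for <- function(loc_list, country_id = "US") {
  for (loc in loc_list) {
    if (identical(loc[["Country"]][["ID"]], country_id)) {
      return(loc[["Key"]])
    }
  }
  stop("No location found for country ", country_id)
}

# trimmed stand-in for a parsed location response
demo_loc_list <- list(
  list(Key = "504690_PC", Country = list(ID = "AR")),
  list(Key = "375860_PC", Country = list(ID = "CH"))
)

location_key_for(demo_loc_list, country_id = "CH")
```

Stopping with an error when nothing matches is a design choice; it keeps a wrong or empty key from being passed on to the conditions endpoint.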

The Tasty API

Motivating prompt

I’ve got a whole bunch of kale and I need some dinner inspiration!

The recipes list endpoint

  • Headers

    • X-RapidAPI-Key: a “password” that authorizes my use of the API

    • X-RapidAPI-Host: the host name of the Tasty API (the base URL without the https:// prefix)

  • Query string parameters

    • from: number of recipes to ignore

    • size: number of recipes to return

    • q: the search term to filter recipes by
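Together, from and size make it possible to step through a long list of recipes one page at a time. The sketch below only builds the requests and looks at their URLs; the authorization headers are left out, and the page size, number of pages, and object names are assumptions for illustration.

```r
library(httr2)

page_size <- 10
# number of recipes to skip for pages 1, 2, and 3
offsets <- seq(from = 0, by = page_size, length.out = 3)

# one request per page (built, not sent; authorization headers omitted)
page_requests <- lapply(offsets, function(offset) {
  request("https://tasty.p.rapidapi.com") |>
    req_url_path_append("recipes/list") |>
    req_url_query(from = offset, size = page_size, q = "kale")
})

page_urls <- vapply(page_requests, function(req) req$url, character(1))
page_urls
```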

A note on authorization

Note: API authorization

Authorization is the process by which the API determines if you have permission to access the requested data resource.

  • A bit of an inconsistent mess, to be honest

  • Three primary authorization approaches

    • API key (e.g., AccuWeather and Tasty)

    • OAuth token (best practice)

    • Username and password (worst practice)

  • Authorization information can be sent in headers (best practice; hidden), or as query parameters (worst practice; visible in URL)

  • Documentation should clearly indicate the method required
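The difference between the two ways of sending an API key is easy to see by building both requests and comparing their URLs. This is a minimal sketch: the endpoint https://example.com/api, the header name X-API-Key, and the key itself are all placeholders, not part of any real API. For the other approaches, {httr2} also provides req_auth_bearer_token() and req_auth_basic().

```r
library(httr2)

# placeholder key; in practice it might be read with Sys.getenv()
api_key <- "MY-SECRET-KEY"

# key sent as a query parameter: it becomes part of the URL
key_in_query <- request("https://example.com/api") |>
  req_url_query(apikey = api_key)

# key sent as a header: the URL is left untouched
key_in_header <- request("https://example.com/api") |>
  req_headers(`X-API-Key` = api_key)

key_in_query$url
key_in_header$url
```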

Note: Authorization vs. authentication

During your travels in API land, you’ll likely come across the term authentication. We said that authorization is a data permissions process. In contrast, authentication is a client identification process. The API will authenticate to link your GET request to you, and then authorize you (or not) to access the data you’ve asked for.

Kale recipes

  • A new {httr2} function, req_headers(), will help us pass the information we need to authorize our request
tasty_key <- "23c02732cbmsh0433330afcf2156p1c39cajsnb3c98d964491"
base_url <- "https://tasty.p.rapidapi.com"
request(base_url) |>
  req_url_path_append("recipes/list") |> # specifying the endpoint
  req_headers(`X-RapidAPI-Key` = tasty_key,
              `X-RapidAPI-Host` = "tasty.p.rapidapi.com") |>
  req_url_query(from = 0, # don't skip any recipes in list
                size = 10, # give me 10 recipes
                q = "kale") |> # related to kale
  req_perform() |>
  resp_body_json() ->
  kale_recipes

kale_recipes[["results"]][[1]][["description"]]
[1] "Looking for a healthy and delicious pasta dish? Try this Whole Wheat Pasta with Lemon Kale Chicken recipe! The pasta is mixed with flavorful lemon and garlic sautéed kale and tender chicken making it a satisfying and savory dinner."
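To look at every description at once rather than one recipe at a time, the same element can be extracted from each result with map_chr() from {purrr}. The sketch below uses a small hand-built list, demo_recipes, in place of the real response, and assumes that some recipes might lack a description, in which case .default fills in NA.

```r
library(purrr)

# hand-built stand-in with the same nesting as the parsed Tasty response
demo_recipes <- list(
  results = list(
    list(description = "First description"),
    list(description = NULL),
    list(description = "Third description")
  )
)

# pull "description" out of every result; missing values become NA
descriptions <- map_chr(demo_recipes[["results"]], "description",
                        .default = NA_character_)
descriptions
```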