{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Lecture 2: Basic Single & Multi-Table SQL\n",
"======================"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Library/Python/2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated. You should import from traitlets.config instead.\n",
" \"You should import from traitlets.config instead.\", ShimWarning)\n",
"/Library/Python/2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.\n",
" warn(\"IPython.utils.traitlets has moved to a top-level traitlets package.\")\n"
]
},
{
"data": {
"text/plain": [
"'Connected: None@None'"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%load_ext sql\n",
"%sql sqlite://"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Let's create a table, stuff it with data, and query it!"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql drop table if exists product;\n",
"create table product(\n",
" pname varchar primary key, -- name of the product\n",
" price money, -- price of the product\n",
" category varchar, -- category\n",
" manufacturer varchar NOT NULL -- manufacturer\n",
");\n",
"insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n",
"insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n",
"insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n",
"insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Let's look at the products.."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"
\n",
" \n",
" pname | \n",
" price | \n",
" category | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Photography | \n",
" Canon | \n",
"
\n",
" \n",
" MultiTouch | \n",
" 203.99 | \n",
" Household | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n",
" (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n",
" (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n",
" (u'MultiTouch', 203.99, u'Household', u'Hitachi')]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select * from product;"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Some *terminology* for SQL.\n",
"--------------------------\n",
"* The _name_ of the table is product.\n",
"* Each row of the table is called a _row_ or a _tuple_. \n",
"* Notice all tuples have the fields or _attributes_.\n",
"* The number of rows is called the _cardinality_ while the number of attributes is called the _arity_"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Schema Conventions\n",
"-----------------\n",
"* The schema of product is written as follows:\n",
"\n",
"> product(pname, price, category, manufacturer)\n",
"\n",
"Underlining a set of attributes indicates that they form a _key_.\n",
"\n",
"* In this case, pname is a key. If the product name was only unique for a given manufacturer, we'd write:\n",
"\n",
"> product(pname, price, category, manufacturer)\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Tables Explained\n",
"----------------\n",
"* A tuple = a record\n",
" * Restriction: all attributes are of atomic type\n",
" * There are many atomic data types in SQL engines, look [here](http://www.postgresql.org/docs/9.4/static/datatype.html) for example.\n",
"\n",
"\n",
"* A table = a (multi)-set of tuples\n",
" * A multiset is like a list…\n",
" * ... but a mutiset is unordered: \n",
" * no first(), no next(), no last()."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"# Outline\n",
"* Create a database -- done!\n",
"\n",
"* Simple querying -- now!\n",
"\n",
"* Queries with more than one relation -- next!\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Let the (Simple) querying begin! \n",
"* We'll introduce the basics of SQL by example.\n",
"* There are many good SQL tutorials on the web, this is intended to get you started."
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"> SELECT (attributes)
\n",
"> FROM (one or more tables)
\n",
"> WHERE (conditions)\n",
"\n",
"This is the simple SELECT-FROM-WHERE (SFW) block. Let's see some examples!"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" category | \n",
" manufacturer | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT * from Product \n",
"WHERE category='Gadgets' and price > 20.0;"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Let's give an example of *projection*, i.e., we only retain some attributes from the query. "
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Canon | \n",
"
\n",
" \n",
" MultiTouch | \n",
" 203.99 | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'GizmoWorks'),\n",
" (u'PowerGizmo', 29.99, u'GizmoWorks'),\n",
" (u'SingleTouch', 149.99, u'Canon'),\n",
" (u'MultiTouch', 203.99, u'Hitachi')]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql \n",
"SELECT Pname, Price, Manufacturer\n",
"FROM Product;"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"* The output is *still* a table, and its schema is \n",
"> Answer(pname, price, manufacturer)"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"* Of course, we can combine selection and projection."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" GizmoWorks | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'GizmoWorks'), (u'PowerGizmo', 29.99, u'GizmoWorks')]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT Pname, Price, Manufacturer\n",
"FROM Product\n",
"WHERE category='Gadgets';"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"The output of a query on a table is again a table \n",
"----------------------------------------------\n",
"* This is because the query language is *compositional*\n",
"* The output of a query really is a table!\n",
"* look at this crazy query, what does it ask for?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" manufacturer | \n",
" pname | \n",
" price | \n",
"
\n",
" \n",
" GizmoWorks | \n",
" PowerGizmo | \n",
" 29.99 | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'GizmoWorks', u'PowerGizmo', 29.99)]"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT * FROM Product;\n",
"\n",
"SELECT\n",
" p.manufacturer, p.pname, p.price\n",
"FROM \n",
" (SELECT distinct p0.Manufacturer\n",
" FROM Product p0\n",
" WHERE p0.price < 20.00) cp, -- this is a nested query!\n",
" Product p\n",
"WHERE \n",
" p.manufacturer = cp.manufacturer and p.price > 20.00"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Details on SQL\n",
"--------------\n",
"\n",
"* Some elements are case insensitive (think: program):\n",
" * Same: SELECT Select select\n",
" * Same: Product product\n",
" * Different: ‘Seattle’ ‘seattle’\n",
" \n",
"\n",
"* Constants (single quotes)\n",
" * ‘abc’ - yes\n",
" * “abc” - no\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"LIKE\n",
"====\n",
"\n",
"The LIKE operator is to search strings, perhaps with wildcards. Format is:\n",
" \n",
"> SELECT *\n",
"> FROM Products\n",
"> WHERE pname like '%gizmo%'\n",
"\n",
"* % matches any number of characters\n",
"* \\_ matches one character\n",
"* The like operator is case sensitive\n"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" category | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n",
" (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT * FROM product\n",
"where pname LIKE '%Gizmo%'"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Eliminating Duplicates\n",
"---------------------\n",
"* duplicates can sometimes be unwelcome or suprising. \n",
" * Recall tables are _multisets_!"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" category | \n",
"
\n",
" \n",
" Gadgets | \n",
"
\n",
" \n",
" Gadgets | \n",
"
\n",
" \n",
" Photography | \n",
"
\n",
" \n",
" Household | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gadgets',), (u'Gadgets',), (u'Photography',), (u'Household',)]"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql SELECT category from product;"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" category | \n",
"
\n",
" \n",
" Gadgets | \n",
"
\n",
" \n",
" Photography | \n",
"
\n",
" \n",
" Household | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gadgets',), (u'Photography',), (u'Household',)]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql \n",
"-- easy to remove duplicates, use the distinct keyword\n",
"SELECT DISTINCT category from product;"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"Ordering the results\n",
"---------------------\n",
"* Sometimes you want the results ordered, let's see some examples!\n"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" manufacturer | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Canon | \n",
"
\n",
" \n",
" MultiTouch | \n",
" 203.99 | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'SingleTouch', 149.99, u'Canon'), (u'MultiTouch', 203.99, u'Hitachi')]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"-- sometimes we want to order the results.\n",
"-- order by is ascending by default!\n",
"SELECT pname, price, manufacturer\n",
"FROM Product\n",
"WHERE price > 50\n",
"ORDER BY price, pname"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" price | \n",
" manufacturer | \n",
"
\n",
" \n",
" 149.99 | \n",
" Canon | \n",
"
\n",
" \n",
" 29.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" 19.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" 203.99 | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(149.99, u'Canon'),\n",
" (29.99, u'GizmoWorks'),\n",
" (19.99, u'GizmoWorks'),\n",
" (203.99, u'Hitachi')]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"-- sometimes we want to order the results.\n",
"-- can order like so, each component individually\n",
"SELECT price, manufacturer\n",
"FROM Product\n",
"-- the order is \"dictionary order\" in the clause.\n",
"ORDER BY manufacturer ASC, price DESC"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Joins!\n",
"------\n",
"Let's illustrate some more complex queries that join two tables together.\n",
"\n",
"* Consider a table of companies, stock price, and HQ country.\n",
"> company(cname, stockprice, country)\n",
" \n",
"* we'll then revist products and introduce some consistency requirements "
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n",
"Done.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"drop table if exists product; -- This needs to be dropped if exists, see why further down!\n",
"drop table if exists company;\n",
"create table company (\n",
" cname varchar primary key, -- company name uniquely identifies the company.\n",
" stockprice money, -- stock price is in money \n",
" country varchar); -- country is just a string\n",
"insert into company values ('GizmoWorks', 25.0, 'USA');\n",
"insert into company values ('Canon', 65.0, 'Japan');\n",
"insert into company values ('Hitachi', 15.0, 'Japan');"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" cname | \n",
" stockprice | \n",
" country | \n",
"
\n",
" \n",
" GizmoWorks | \n",
" 25 | \n",
" USA | \n",
"
\n",
" \n",
" Canon | \n",
" 65 | \n",
" Japan | \n",
"
\n",
" \n",
" Hitachi | \n",
" 15 | \n",
" Japan | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'GizmoWorks', 25, u'USA'),\n",
" (u'Canon', 65, u'Japan'),\n",
" (u'Hitachi', 15, u'Japan')]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%sql select * from company;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Foreign Key Constraints\n",
"-----------------------\n",
"* Suppose that we want to create a products table\n",
"\n",
"> Product(pname, price, category, manufacturer)\n",
"\n",
"* Something is odd here: We can have manufacturers that sell products but don't occur in our company table!\n",
"* To protect against, this we introduce _foreign keys_ "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We say the company name in products _refers_ to cname in company. Let's do it in SQL! the key statement below is:\n",
"\n",
"> foreign key (manufacturer) references company(cname)\n",
"\n",
" * Note that cname must be a key in company! \n",
" * Keys and Foreign keys come up _all_ the time. \n",
" * PKs and FKs are common (others, less so)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n",
"Done.\n",
"Done.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n",
"1 rows affected.\n"
]
},
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql drop table if exists product;\n",
"pragma foreign_keys = ON; -- WARNING by default off in sqlite\n",
"create table product(\n",
" pname varchar primary key, -- name of the product\n",
" price money, -- price of the product\n",
" category varchar, -- category\n",
" manufacturer varchar, -- manufacturer\n",
" foreign key (manufacturer) references company(cname));\n",
"\n",
"insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n",
"insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n",
"insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n",
"insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Indeed foreign keys are a _constraint_ \n",
"> What happens if we introduce a company name not in our table?\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(IntegrityError) UNIQUE constraint failed: product.pname u\"insert into product values('MultiTouch', 203.99, 'Household', 'Google');\" ()\n",
"Rejected!\n"
]
}
],
"source": [
"try:\n",
" %sql insert into product values('MultiTouch', 203.99, 'Household', 'Google');\n",
"except Exception as e:\n",
" print e\n",
" print \"Rejected!\""
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" category | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" Gadgets | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Photography | \n",
" Canon | \n",
"
\n",
" \n",
" MultiTouch | \n",
" 203.99 | \n",
" Household | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n",
" (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n",
" (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n",
" (u'MultiTouch', 203.99, u'Household', u'Hitachi')]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"-- the update is rejected!\n",
"select * from product;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Foreign Keys and Delete\n",
"=============\n",
"\n",
"* What happens if we delete a company? Three options:\n",
" * Disallow the delete. (default)\n",
" * Remove all products (add \"`on delete cascade`\")\n",
" * A third variant due to NULL\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**First option (default)- delete is disallowed**"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(IntegrityError) FOREIGN KEY constraint failed u\"delete from company where cname = 'Hitachi';\" ()\n",
"Disallowed!\n"
]
}
],
"source": [
"try:\n",
" %sql delete from company where cname = 'Hitachi';\n",
"except Exception as e:\n",
" print e\n",
" print \"Disallowed!\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Second option: remove all products belonging to the company we delete**\n",
"\n",
"Try adjusting the foreign key constraint clause when you create the products table as follows:\n",
"> foreign key (manufacturer) references company(cname) on delete cascade\n",
"\n",
"Now, when a company row is deleted, all of the products linked by the foreign key constraint will be deleted as well."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Joins!\n",
"------\n",
"> Product (pname, price, category, manufacturer)
\n",
"> Company (cname, stockPrice, country)\n",
"\n",
"The query we want to answer is:\n",
"\n",
"> Find all products under $200 manufactured in Japan;\n",
"> return their names and prices. \n",
"\n",
"Notice products don't have a location and manufacturers don't have price. Need info in _each_ of the tables."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'SingleTouch', 149.99)]"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"SELECT pname, price\n",
"FROM product, company\n",
"where manufacturer=cname and country='Japan' and price <= 200;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's see how to write this join in a modular way."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" cname | \n",
"
\n",
" \n",
" Canon | \n",
"
\n",
" \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Canon',), (u'Hitachi',)]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql -- Part 1: Set of Japanese companies.\n",
"SELECT distinct cname -- do we need distinct?\n",
"from company where country='Japan';"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" manufacturer | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" GizmoWorks | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Canon | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'GizmoWorks'),\n",
" (u'PowerGizmo', 29.99, u'GizmoWorks'),\n",
" (u'SingleTouch', 149.99, u'Canon')]"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql -- Part 2: Cheap Products (under $200)\n",
"select distinct pname, price, manufacturer\n",
"from product\n",
"where price <= 200;"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
" manufacturer | \n",
" cname | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" GizmoWorks | \n",
" Canon | \n",
"
\n",
" \n",
" Gizmo | \n",
" 19.99 | \n",
" GizmoWorks | \n",
" Hitachi | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" GizmoWorks | \n",
" Canon | \n",
"
\n",
" \n",
" PowerGizmo | \n",
" 29.99 | \n",
" GizmoWorks | \n",
" Hitachi | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Canon | \n",
" Canon | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
" Canon | \n",
" Hitachi | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'Gizmo', 19.99, u'GizmoWorks', u'Canon'),\n",
" (u'Gizmo', 19.99, u'GizmoWorks', u'Hitachi'),\n",
" (u'PowerGizmo', 29.99, u'GizmoWorks', u'Canon'),\n",
" (u'PowerGizmo', 29.99, u'GizmoWorks', u'Hitachi'),\n",
" (u'SingleTouch', 149.99, u'Canon', u'Canon'),\n",
" (u'SingleTouch', 149.99, u'Canon', u'Hitachi')]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql -- combine them with nested SFW queries... this is a cross product?\n",
"SELECT * \n",
"FROM \n",
" (SELECT DISTINCT pname, price, manufacturer\n",
" FROM product\n",
" WHERE price <= 200) CheapProducts,\n",
" (SELECT DISTINCT cname\n",
" FROM company\n",
" WHERE country='Japan') JapaneseProducts;"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" pname | \n",
" price | \n",
"
\n",
" \n",
" SingleTouch | \n",
" 149.99 | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'SingleTouch', 149.99)]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql\n",
"-- Combine them as a join!\n",
"SELECT DISTINCT pname, price\n",
"FROM \n",
" (SELECT DISTINCT pname, price, manufacturer\n",
" FROM product\n",
" WHERE price <= 200) CheapProducts,\n",
" (SELECT distinct cname\n",
" FROM company\n",
" WHERE country='Japan') JapaneseProducts\n",
"WHERE cname = manufacturer;"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Takeways\n",
"--------\n",
"* There are potentially _many logically equivalent ways_ to write a query\n",
" * This fact will be used later by the query optimizer and in homework!\n",
" * On exams, write the simplest thing (break it down in parts?)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Duplicate answers from join\n",
"--------------------------\n",
"\n",
"Note that we can get duplicate answers from a join..."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Done.\n"
]
},
{
"data": {
"text/html": [
"\n",
" \n",
" country | \n",
"
\n",
" \n",
" USA | \n",
"
\n",
" \n",
" USA | \n",
"
\n",
"
"
],
"text/plain": [
"[(u'USA',), (u'USA',)]"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"%%sql -- duplicate answer\n",
"SELECT Country\n",
"FROM Product, Company\n",
"WHERE Manufacturer=CName AND Category='Gadgets';"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 1
}