{ "cells": [ { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Lecture 2: Basic Single & Multi-Table SQL\n", "======================" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/Library/Python/2.7/site-packages/IPython/config.py:13: ShimWarning: The `IPython.config` package has been deprecated. You should import from traitlets.config instead.\n", " \"You should import from traitlets.config instead.\", ShimWarning)\n", "/Library/Python/2.7/site-packages/IPython/utils/traitlets.py:5: UserWarning: IPython.utils.traitlets has moved to a top-level traitlets package.\n", " warn(\"IPython.utils.traitlets has moved to a top-level traitlets package.\")\n" ] }, { "data": { "text/plain": [ "'Connected: None@None'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%load_ext sql\n", "%sql sqlite://" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's create a table, stuff it with data, and query it!" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql drop table if exists product;\n", "create table product(\n", " pname varchar primary key, -- name of the product\n", " price money, -- price of the product\n", " category varchar, -- category\n", " manufacturer varchar NOT NULL -- manufacturer\n", ");\n", "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n", "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's look at the products.." ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
SingleTouch149.99PhotographyCanon
MultiTouch203.99HouseholdHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n", " (u'MultiTouch', 203.99, u'Household', u'Hitachi')]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql select * from product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Some *terminology* for SQL.\n", "--------------------------\n", "* The _name_ of the table is product.\n", "* Each row of the table is called a _row_ or a _tuple_. \n", "* Notice all tuples have the fields or _attributes_.\n", "* The number of rows is called the _cardinality_ while the number of attributes is called the _arity_" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Schema Conventions\n", "-----------------\n", "* The schema of product is written as follows:\n", "\n", "> product(pname, price, category, manufacturer)\n", "\n", "Underlining a set of attributes indicates that they form a _key_.\n", "\n", "* In this case, pname is a key. If the product name was only unique for a given manufacturer, we'd write:\n", "\n", "> product(pname, price, category, manufacturer)\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Tables Explained\n", "----------------\n", "* A tuple = a record\n", " * Restriction: all attributes are of atomic type\n", " * There are many atomic data types in SQL engines, look [here](http://www.postgresql.org/docs/9.4/static/datatype.html) for example.\n", "\n", "\n", "* A table = a (multi)-set of tuples\n", " * A multiset is like a list…\n", " * ... but a mutiset is unordered: \n", " * no first(), no next(), no last()." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "# Outline\n", "* Create a database -- done!\n", "\n", "* Simple querying -- now!\n", "\n", "* Queries with more than one relation -- next!\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Let the (Simple) querying begin! \n", "* We'll introduce the basics of SQL by example.\n", "* There are many good SQL tutorials on the web, this is intended to get you started." ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "> SELECT (attributes)
\n", "> FROM (one or more tables)
\n", "> WHERE (conditions)\n", "\n", "This is the simple SELECT-FROM-WHERE (SFW) block. Let's see some examples!" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
PowerGizmo29.99GadgetsGizmoWorks
" ], "text/plain": [ "[(u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * from Product \n", "WHERE category='Gadgets' and price > 20.0;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Let's give an example of *projection*, i.e., we only retain some attributes from the query. " ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
SingleTouch149.99Canon
MultiTouch203.99Hitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Canon'),\n", " (u'MultiTouch', 203.99, u'Hitachi')]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "SELECT Pname, Price, Manufacturer\n", "FROM Product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "* The output is *still* a table, and its schema is \n", "> Answer(pname, price, manufacturer)" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "* Of course, we can combine selection and projection." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'), (u'PowerGizmo', 29.99, u'GizmoWorks')]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT Pname, Price, Manufacturer\n", "FROM Product\n", "WHERE category='Gadgets';" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "The output of a query on a table is again a table \n", "----------------------------------------------\n", "* This is because the query language is *compositional*\n", "* The output of a query really is a table!\n", "* look at this crazy query, what does it ask for?" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
manufacturerpnameprice
GizmoWorksPowerGizmo29.99
" ], "text/plain": [ "[(u'GizmoWorks', u'PowerGizmo', 29.99)]" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * FROM Product;\n", "\n", "SELECT\n", " p.manufacturer, p.pname, p.price\n", "FROM \n", " (SELECT distinct p0.Manufacturer\n", " FROM Product p0\n", " WHERE p0.price < 20.00) cp, -- this is a nested query!\n", " Product p\n", "WHERE \n", " p.manufacturer = cp.manufacturer and p.price > 20.00" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Details on SQL\n", "--------------\n", "\n", "* Some elements are case insensitive (think: program):\n", " * Same: SELECT Select select\n", " * Same: Product product\n", " * Different: ‘Seattle’ ‘seattle’\n", " \n", "\n", "* Constants (single quotes)\n", " * ‘abc’ - yes\n", " * “abc” - no\n" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "LIKE\n", "====\n", "\n", "The LIKE operator is to search strings, perhaps with wildcards. Format is:\n", " \n", "> SELECT *\n", "> FROM Products\n", "> WHERE pname like '%gizmo%'\n", "\n", "* % matches any number of characters\n", "* \\_ matches one character\n", "* The like operator is case sensitive\n" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks')]" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT * FROM product\n", "where pname LIKE '%Gizmo%'" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Eliminating Duplicates\n", "---------------------\n", "* duplicates can sometimes be unwelcome or suprising. \n", " * Recall tables are _multisets_!" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category
Gadgets
Gadgets
Photography
Household
" ], "text/plain": [ "[(u'Gadgets',), (u'Gadgets',), (u'Photography',), (u'Household',)]" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql SELECT category from product;" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
category
Gadgets
Photography
Household
" ], "text/plain": [ "[(u'Gadgets',), (u'Photography',), (u'Household',)]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql \n", "-- easy to remove duplicates, use the distinct keyword\n", "SELECT DISTINCT category from product;" ] }, { "cell_type": "markdown", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Ordering the results\n", "---------------------\n", "* Sometimes you want the results ordered, let's see some examples!\n" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
SingleTouch149.99Canon
MultiTouch203.99Hitachi
" ], "text/plain": [ "[(u'SingleTouch', 149.99, u'Canon'), (u'MultiTouch', 203.99, u'Hitachi')]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- sometimes we want to order the results.\n", "-- order by is ascending by default!\n", "SELECT pname, price, manufacturer\n", "FROM Product\n", "WHERE price > 50\n", "ORDER BY price, pname" ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "slideshow": { "slide_type": "subslide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pricemanufacturer
149.99Canon
29.99GizmoWorks
19.99GizmoWorks
203.99Hitachi
" ], "text/plain": [ "[(149.99, u'Canon'),\n", " (29.99, u'GizmoWorks'),\n", " (19.99, u'GizmoWorks'),\n", " (203.99, u'Hitachi')]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- sometimes we want to order the results.\n", "-- can order like so, each component individually\n", "SELECT price, manufacturer\n", "FROM Product\n", "-- the order is \"dictionary order\" in the clause.\n", "ORDER BY manufacturer ASC, price DESC" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Joins!\n", "------\n", "Let's illustrate some more complex queries that join two tables together.\n", "\n", "* Consider a table of companies, stock price, and HQ country.\n", "> company(cname, stockprice, country)\n", " \n", "* we'll then revist products and introduce some consistency requirements " ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "drop table if exists product; -- This needs to be dropped if exists, see why further down!\n", "drop table if exists company;\n", "create table company (\n", " cname varchar primary key, -- company name uniquely identifies the company.\n", " stockprice money, -- stock price is in money \n", " country varchar); -- country is just a string\n", "insert into company values ('GizmoWorks', 25.0, 'USA');\n", "insert into company values ('Canon', 65.0, 'Japan');\n", "insert into company values ('Hitachi', 15.0, 'Japan');" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cnamestockpricecountry
GizmoWorks25USA
Canon65Japan
Hitachi15Japan
" ], "text/plain": [ "[(u'GizmoWorks', 25, u'USA'),\n", " (u'Canon', 65, u'Japan'),\n", " (u'Hitachi', 15, u'Japan')]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%sql select * from company;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Foreign Key Constraints\n", "-----------------------\n", "* Suppose that we want to create a products table\n", "\n", "> Product(pname, price, category, manufacturer)\n", "\n", "* Something is odd here: We can have manufacturers that sell products but don't occur in our company table!\n", "* To protect against, this we introduce _foreign keys_ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We say the company name in products _refers_ to cname in company. Let's do it in SQL! the key statement below is:\n", "\n", "> foreign key (manufacturer) references company(cname)\n", "\n", " * Note that cname must be a key in company! \n", " * Keys and Foreign keys come up _all_ the time. \n", " * PKs and FKs are common (others, less so)" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n", "Done.\n", "Done.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n", "1 rows affected.\n" ] }, { "data": { "text/plain": [ "[]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql drop table if exists product;\n", "pragma foreign_keys = ON; -- WARNING by default off in sqlite\n", "create table product(\n", " pname varchar primary key, -- name of the product\n", " price money, -- price of the product\n", " category varchar, -- category\n", " manufacturer varchar, -- manufacturer\n", " foreign key (manufacturer) references company(cname));\n", "\n", "insert into product values('Gizmo', 19.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('PowerGizmo', 29.99, 'Gadgets', 'GizmoWorks');\n", "insert into product values('SingleTouch', 149.99, 'Photography', 'Canon');\n", "insert into product values('MultiTouch', 203.99, 'Household', 'Hitachi');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Indeed foreign keys are a _constraint_ \n", "> What happens if we introduce a company name not in our table?\n" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(IntegrityError) UNIQUE constraint failed: product.pname u\"insert into product values('MultiTouch', 203.99, 'Household', 'Google');\" ()\n", "Rejected!\n" ] } ], "source": [ "try:\n", " %sql insert into product values('MultiTouch', 203.99, 'Household', 'Google');\n", "except Exception as e:\n", " print e\n", " print \"Rejected!\"" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricecategorymanufacturer
Gizmo19.99GadgetsGizmoWorks
PowerGizmo29.99GadgetsGizmoWorks
SingleTouch149.99PhotographyCanon
MultiTouch203.99HouseholdHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'Gadgets', u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'Gadgets', u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Photography', u'Canon'),\n", " (u'MultiTouch', 203.99, u'Household', u'Hitachi')]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- the update is rejected!\n", "select * from product;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Foreign Keys and Delete\n", "=============\n", "\n", "* What happens if we delete a company? Three options:\n", " * Disallow the delete. (default)\n", " * Remove all products (add \"`on delete cascade`\")\n", " * A third variant due to NULL\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**First option (default)- delete is disallowed**" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(IntegrityError) FOREIGN KEY constraint failed u\"delete from company where cname = 'Hitachi';\" ()\n", "Disallowed!\n" ] } ], "source": [ "try:\n", " %sql delete from company where cname = 'Hitachi';\n", "except Exception as e:\n", " print e\n", " print \"Disallowed!\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Second option: remove all products belonging to the company we delete**\n", "\n", "Try adjusting the foreign key constraint clause when you create the products table as follows:\n", "> foreign key (manufacturer) references company(cname) on delete cascade\n", "\n", "Now, when a company row is deleted, all of the products linked by the foreign key constraint will be deleted as well." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Joins!\n", "------\n", "> Product (pname, price, category, manufacturer)
\n", "> Company (cname, stockPrice, country)\n", "\n", "The query we want to answer is:\n", "\n", "> Find all products under $200 manufactured in Japan;\n", "> return their names and prices. \n", "\n", "Notice products don't have a location and manufacturers don't have price. Need info in _each_ of the tables." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnameprice
SingleTouch149.99
" ], "text/plain": [ "[(u'SingleTouch', 149.99)]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "SELECT pname, price\n", "FROM product, company\n", "where manufacturer=cname and country='Japan' and price <= 200;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's see how to write this join in a modular way." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
cname
Canon
Hitachi
" ], "text/plain": [ "[(u'Canon',), (u'Hitachi',)]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- Part 1: Set of Japanese companies.\n", "SELECT distinct cname -- do we need distinct?\n", "from company where country='Japan';" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturer
Gizmo19.99GizmoWorks
PowerGizmo29.99GizmoWorks
SingleTouch149.99Canon
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks'),\n", " (u'SingleTouch', 149.99, u'Canon')]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- Part 2: Cheap Products (under $200)\n", "select distinct pname, price, manufacturer\n", "from product\n", "where price <= 200;" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnamepricemanufacturercname
Gizmo19.99GizmoWorksCanon
Gizmo19.99GizmoWorksHitachi
PowerGizmo29.99GizmoWorksCanon
PowerGizmo29.99GizmoWorksHitachi
SingleTouch149.99CanonCanon
SingleTouch149.99CanonHitachi
" ], "text/plain": [ "[(u'Gizmo', 19.99, u'GizmoWorks', u'Canon'),\n", " (u'Gizmo', 19.99, u'GizmoWorks', u'Hitachi'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks', u'Canon'),\n", " (u'PowerGizmo', 29.99, u'GizmoWorks', u'Hitachi'),\n", " (u'SingleTouch', 149.99, u'Canon', u'Canon'),\n", " (u'SingleTouch', 149.99, u'Canon', u'Hitachi')]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- combine them with nested SFW queries... this is a cross product?\n", "SELECT * \n", "FROM \n", " (SELECT DISTINCT pname, price, manufacturer\n", " FROM product\n", " WHERE price <= 200) CheapProducts,\n", " (SELECT DISTINCT cname\n", " FROM company\n", " WHERE country='Japan') JapaneseProducts;" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
pnameprice
SingleTouch149.99
" ], "text/plain": [ "[(u'SingleTouch', 149.99)]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql\n", "-- Combine them as a join!\n", "SELECT DISTINCT pname, price\n", "FROM \n", " (SELECT DISTINCT pname, price, manufacturer\n", " FROM product\n", " WHERE price <= 200) CheapProducts,\n", " (SELECT distinct cname\n", " FROM company\n", " WHERE country='Japan') JapaneseProducts\n", "WHERE cname = manufacturer;" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Takeways\n", "--------\n", "* There are potentially _many logically equivalent ways_ to write a query\n", " * This fact will be used later by the query optimizer and in homework!\n", " * On exams, write the simplest thing (break it down in parts?)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Duplicate answers from join\n", "--------------------------\n", "\n", "Note that we can get duplicate answers from a join..." ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Done.\n" ] }, { "data": { "text/html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
country
USA
USA
" ], "text/plain": [ "[(u'USA',), (u'USA',)]" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "%%sql -- duplicate answer\n", "SELECT Country\n", "FROM Product, Company\n", "WHERE Manufacturer=CName AND Category='Gadgets';" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.13" } }, "nbformat": 4, "nbformat_minor": 1 }