SQL基础笔记

图片 5

Codecademy中Learn SQL,
SQL: Table Transformaton和SQL: Analyzing Business
Metrics三门课程的笔记,以及补充的附加笔记。

Codecademy的课程以SQLite编写,笔记中改成了MySQL语句。

 

I.
Learn SQL

 

 

1.
Manipulation – Create, edit, delete data

 

1.4
Create 创建数据库或数据库中的表

 

CREATE TABLE celebs 
    (
    id INTEGER, 
    name TEXT, 
    age INTEGER 
    ); # 第一列id,数据类型整数;第二列name,数据类型文本;第三列age,数据类型整数

 

1.5
Insert 向表中插入行

 

INSERT INTO celebs ( id, name, age)
    VALUES ( 1, 'Alan Mathison Turing', 42); # 在celebs表最下方插入数据:id列为1,name列为Alan Mathion Turing,age列为42

 

1.6
Select 选取数据

 

SELECT 
    *
FROM
    celebs; # 显示celebs表所有数据

 

1.7
Update 修改数据

 

UPDATE celebs 
SET 
    age = 22
WHERE
    id = 1; # 将celebs表中id=1的行的age改为22

在Safe
Update模式下,WHERE指令必须指定KEY列 

 

1.8
Alert 更改表结构或数据类型

 

ALERT TABLE celebs
ADD COLUMN twitter_handle TEXT AFTER id; # 在celebs表的id列后增加twitter_handle列

 

ALERT TABLE celebs
DROP email, DROP instagram; # 同时删除celebs的email和instagram两列

 

ALERT TABLE 'test'.'data'
CHANGE COLUMN 'Mobile' 'Mobile' BLOB NULL DEFAULT NULL; # 将表test.data的Mobile列的数据类型改为BLOB,该列数据默认为NULL

 

1.9
DELETE 删除行

 

DELETE FROM celebs 
WHERE
    twitter_handle IS NULL; # 删除表celebs中twitter_handle为NULL的行

在Safe
Update模式下,WHERE指令必须指定KEY列 

 

DELETE FROM course
WHERE
    Cno IN (8,9);

 

 

2.
Queries – Retrieve data

 

2.3
Select Distinct 返回唯一不同的值

 

SELECT DISTINCT
    genre
FROM
    movies; # 查询movies表中genre列的所有不重复值

 

2.4
Where 规定选择的标准

 

SELECT 
    *
FROM
    movies
WHERE
    imdb_rating > 8; # 查询movies表中imdb_rating大于8的行

 

= equals

!= not
equals

> greater
than

< less
than

>= greater
than or equal to

<= less than
or equal to

 

2.5
Like I 在 WHERE 子句中搜索列中的指定模式

 

SELECT 
    *
FROM
    movies
WHERE
    name LIKE ' Se_en';

 

2.6
Like II

 

SELECT 
    *
FROM
    movies
WHERE
    name LIKE 'a%';

 

查询KPI表的cout_process列中包含’50%’字符的行:

SELECT 
    *
FROM
    KPI
WHERE
    count_process LIKE '%50%%';

 

NB

“”为转义符号,代表第二个%为文本字符,而非通配符

 

‘_’
substitutes any individual character

‘%’ matches
zero or more missing characters

 

 

2.7
Between 在 WHERE 子句中使用,选取介于两个值之间的数据范围

 

The BETWEEN
operator is used to filter the result set within a certain range. The
values can be numbers, text or dates.

SELECT 
    *
FROM
    movies
WHERE
    name BETWEEN 'A' AND 'J'; # 查询movies中name以A至J开头的所有行

 

NB: names
that begin with letter “A” up to but not including “J”.

不同的数据库对
BETWEEN…AND
操作符的处理方式是有差异的,有开区间、闭区间,也有半开半闭区间。

 

SELECT 
    *
FROM
    movies
WHERE
    year BETWEEN 1990 AND 2000; # 查询movies中year在1990至2000年间的行

 

NB: years
between 1990 up to and including 2000

 

2.8
And 且运算符

 

AND is an
operator that combines two conditions. Both conditions must be true for
the row to be included in the result set.

SELECT 
    *
FROM
    movies
WHERE
    year BETWEEN 1990 AND 2000
        AND genre = 'comedy'; # 查询movies中year在1990至2000间,且genre为comedy的行

 

2.9
Or 或运算符

 

OR is used to
combine more than one condition in WHERE clause. It evaluates each
condition separately and if any of the conditions are true than the row
is added to the result set. OR is an operator that filters the result
set to only include rows where either condition is true.

SELECT 
    *
FROM
    movies
WHERE
    genre = 'comedy' OR year < 1980; # 查询movies中genre为comedy,或year小于1980的行

 

2.10
Order By 对结果集进行排序

 

SELECT 
    *
FROM
    movies
ORDER BY imdb_rating DESC; # 查询movies中的行,结果以imdb_rating降序排列

 

DESC sorts the
result by a particular column in descending order (high to low or Z –
A).

ASC ascending
order (low to high or A – Z).

 

2.11
Limit 规定返回的记录的数目

 

LIMIT is a
clause that lets you specify the maximum number of rows the result set
will have.

SELECT 
    *
FROM
    movies
ORDER BY imdb_rating ASC
LIMIT 3;  # 查询movies中的行,结果以imdb_rating升序排列,仅返回前3行

 

MS SQL
Server中使用SELECT TOP 3,Oracle中使用WHERE ROWNUM <= 5(?)

 

 

3.
Aggregate Function

 

3.2
Count 返回匹配指定条件的行数

 

COUNT( ) is a
function that takes the name of a column as an argument and counts the
number of rows where the column is not NULL.

SELECT 
    COUNT(*)
FROM
    fake_apps
WHERE 
    price = 0; # 返回fake_apps中price=0的行数

 

3.3
Group By 合计函数

 

SELECT 
    price, COUNT(*)
FROM
    fake_apps
WHERE
    downloads > 2000
GROUP BY price; # 查询fake_apps表中downloads大于2000的行,将结果集根据price分组,返回price和行数

 

Here, our
aggregate function is COUNT( ) and we are passing price as an
argument(参数) to GROUP BY. SQL will count the total number of apps
for each price in the table.

It is usually
helpful to SELECT the column you pass as an argument to GROUP BY. Here
we SELECT price and COUNT(*).

 

3.4
Sum 返回数值列的总数(总额)

 

SUM is a
function that takes the name of a column as an argument and returns the
sum of all the values in that column.

SELECT 
    category, SUM(downloads)
FROM
    fake_apps
GROUP BY category;

 

3.5
Max 返回一列中的最大值(NULL 值不包括在计算中)

 

MAX( ) is a
function that takes the name of a column as an argument and returns the
largest value in that column.

SELECT 
    name, category, MAX(downloads)
FROM
    fake_apps
GROUP BY category;

 

3.6
Min 返回一列中的最小值(NULL 值不包括在计算中)

 

MIN( ) is a
function that takes the name of a column as an argument and returns the
smallest value in that column.

SELECT 
    name, category, MIN(downloads)
FROM
    fake_apps
GROUP BY category;

 

3.7
Average 返回数值列的平均值(NULL 值不包括在计算中)

 

SELECT 
    price, AVG(downloads)
FROM
    fake_apps
GROUP BY price;

 

3.8
Round 把数值字段舍入为指定的小数位数

 

ROUND( ) is a
function that takes a column name and an integer as an argument. It
rounds the values in the column to the number of decimal places
specified by the integer.

SELECT 
    price, ROUND(AVG(downloads), 2)
FROM
    fake_apps
GROUP BY price;

 

 

4.
Multiple Tables

 

4.2
Primary Key 主键

 

A primary key
serves as a unique identifier for each row or record in a given table.
The primary key is literally an “id” value for a record. We could use
this value to connect the table to other tables.

CREATE TABLE  artists
    (
    id INTEGER PRIMARY KET,
    name TEXT
    );

 

NB

By specifying
that the “id” column is the “PRIMARY KEY”, SQL make sure that:

1. None of the
values in this column are “NULL”;

2. Each value
in this column is unique.

 

A table can
not have more than one “PRIMARY KEY” column.

 

4.3
Foreign Key 外键

 

SELECT 
    *
FROM
    albums
WHERE
    artist_id = 3;

 

A foreign key is
a column that contains the primary key of another table in the database.
We use foreign keys and primary keys to connect rows in two different
tables. One table’s foreign key holds the value of another table’s
primary key. Unlike primary keys, foreign keys do not need to be unique
and can be NULL. Here, artist_id is a foreign key in the “albums”
table.

 

The relationship
between the “artists” table and the “albums” table is the “id” value of
the artists.

 

4.4
Cross Join 用于生成两张表的笛卡尔集

 

SELECT 
    albums.name, albums.year, artists.name
FROM
    albums,
    artists;

 

One way to query
multiple tables is to write a SELECT statement with multiple table names
seperated by a comma. This is also known as a “cross join”.

 

When querying
more than one table, column names need to be specified by
table_name.column_name.

 

Unfortunately,
the result of this cross join is not very useful. It combines every row
of the “artists” table with every row of the “albums” table. It would be
more useful to only combine the rows where the album was created by the
artist.

 

4.5
Inner Join 内连接:在表中存在至少一个匹配时,INNER JOIN
关键字返回行

 

SELECT 
    *
FROM
    albums
        JOIN
    artists ON albums.artist_id = artists.id; # INNER JOIN等价于JOIN,写JOIN默认为INNER JOIN

 

In SQL, joins
are used to combine rows from two or more tables. The most common type
of join in SQL is an inner join.

 

An inner join
will combine rows from different tables if the join condition is
true.

  1. SELECT *:
    specifies the columns our result set will have. Here * refers to every
    column in both tables;

  2. FROM albums:
    specifies first table we are querying;

  3. JOIN artists
    ON: specifies the type of join as well as the second table;

4.
albums.artist_id = artists.id: is the join condition that describes how
the two tables are related to each other. Here, SQL uses the foreign key
column “artist_id” in the “albums” table to match it with exactly one
row in the “artists” table with the same value in the “id” column. It
will only match one row in the “artists” table because “id” is the
PRIMARY KEY of “artists”.

 

4.6
Left Outer Join
左外连接:即使右表中没有匹配,也从左表返回所有的行

 

SELECT 
    *
FROM
    albums
        LEFT JOIN
    artists ON albums.artist_id = artists.id;

 

Outer joins also
combine rows from two or more tables, but unlike inner joins, they do
not require the join condition to be met. Instead, every row in the
left table is returned
 in the result set, and if the join condition is
not met, the NULL values are used to fill in the columns from the right
table.

 

RIGHT JOIN
右外链接:即使左表中没有匹配,也从右表返回所有的行

FULL JOIN
全链接:只要其中一个表中存在匹配,就返回行

 

4.7
Aliases 为列名称和表名称指定别名

 

AS is a keyword
in SQL that allows you to rename a column or table using an alias. The
new name can be anything you want as long as you put it inside of single
quotes.

SELECT 
    albums.name AS 'Album',
    albums.year,
    artists.name AS 'Artist'
FROM
    albums
        JOIN
    artists ON albums.artist_id = artists.id
WHERE
    albums.year > 1980;

 

NB

The columns
have not been renamed in either table. The aliases only appear in the
result set.

 

 

 

II.
SQL: Table Transformation

 

 

1.
Subqueries 子查询

 

1.2
Non-Correlated Subqueries I 不相关子查询

 

SELECT 
    *
FROM
    flights
WHERE
    origin IN (SELECT 
            code
        FROM
            airports
        WHERE
            elevation > 2000);

 

1.4
Non-Correlated Subqueries III

 

SELECT 
    a.dep_month,
    a.dep_day_of_week,
    AVG(a.flight_count) AS average_flights
FROM
    (SELECT 
        dep_month,
            dep_day_of_week,
            dep_date,
            COUNT(*) AS flight_count
    FROM
        flights
    GROUP BY 1 , 2 , 3) a
WHERE
    a.dep_day_of_week = 'Friday'
GROUP BY 1 , 2
ORDER BY 1 , 2; # 返回每个月中,每个星期五的平均航班数量

 

结构

[outer
query]

    FROM

    [inner
query] a

WHERE

GROUP BY

ORDER BY

 

NB

“a”: With the
inner query, we create a virtual table. In the outer query, we can refer
to the inner query as “a”.

“1,2,3” in
inner query: refer to the first, second and third columns
selected

         for
display                      DBMS

SELECT
dep_month,                 (1)

dep_day_of_week,
                    (2)

dep_date,    
                               (3)

COUNT(*) AS
flight_count         (4)

FROM
flights

 

SELECT 
    a.dep_month,
    a.dep_day_of_week,
    AVG(a.flight_distance) AS average_distance
FROM
    (SELECT 
        dep_month,
    dep_day_of_week,
    dep_date,
    SUM(distance) AS flight_distance
    FROM
        flights
    GROUP BY 1 , 2 , 3) a
GROUP BY 1 , 2
ORDER BY 1 , 2; # 返回每个月中,每个周一、周二……至周日的平均飞行距离

 

1.5
Correlated Subqueries I 相关子查询

 

NB

In a correlated
subquery, the subquery can not be run independently of the outer
query. The order of operations is important in a correlated
subquery:

  1. A row is
    processed in the outer query;

  2. Then, for
    that particular row in the outer query, the subquery is executed.

This means
that for each row processed by the outer query, the subquery will also
be processed for that row.

 

SELECT 
    id
FROM
    flights AS f
WHERE
    distance > (SELECT 
            AVG(distance)
        FROM
            flights
        WHERE
            carrier = f.carrier); # the list of all flights whose distance is above average for their carrier

 

1.6
Correlated Subqueries II

 

In the above
query, the inner query has to be reexecuted for each flight. Correlated
subqueries may appear elsewhere besides the WHERE clause, they can also
appear in the SELECT.

 

SELECT 
    carrier,
    id,
    (SELECT 
            COUNT(*)
        FROM
            flights f
        WHERE
            f.id < flights.id
                AND f.carrier = flights.carrier) + 1 AS flight_sequence_number
FROM
    flights; # 结果集为航空公司,航班id以及序号。相同航空公司的航班,id越大则序号越大

 

相关子查询中,对于外查询执行的每一行,子查询都会为这一行执行一次。在这段代码中,每当外查询提取一行数据中的carrier和id,子查询就会COUNT表中有多少行的carrier与外查询中的行的carrier相同,且id小于外查询中的行,并在COUNT结果上+1,这一结果列别名为flight_sequence_number。于是,id越大的航班,序号就越大。

如果将”<“改为”>”,则id越大的航班,序号越小。

 

 

2.
Set Operation

 

2.2
Union 并集 (only distinct values)

 

Sometimes, we
need to merge two tables together and then query the merged
result.

There are two
ways of doing this:

1) Merge the
rows, called a join.

2) Merge the
columns, called a union.

 

SELECT 
    item_name
FROM
    legacy_products 
UNION SELECT 
    item_name
FROM
    new_products;

 

Each SELECT
statement within the UNION must have the same number of
columns
 with similar data types. The columns in each SELECT
statement must be in the same order. By default, the UNION operator
selects only distinct values.

 

2.3
Union All 并集 (allows duplicate values)

 

SELECT 
    AVG(sale_price) 
FROM
    (SELECT 
        id, sale_price
    FROM
        order_items UNION ALL SELECT 
        id, sale_price
    FROM
        order_items_historic) AS a;

 

2.4
Intersect 交集

 

Microsoft SQL
Server’s INTERSECT returns any distinct values that are returned
by both the query on the left and right sides of the INTERSECT
operand.

 

SELECT category FROM new_products
INTERSECT
SELECT category FROM legacy_products;

 

NB

MySQL不滋瓷INTERSECT,但可以用INNER
JOIN+DISTINCT或WHERE…IN+DISTINCT或WHERE EXISTS实现:

 

SELECT DISTINCT
    category
FROM
    new_products
        INNER JOIN
    legacy_products USING (category);

SELECT DISTINCT
    category
FROM
    new_products
WHERE
    category IN (SELECT 
            category
        FROM
            legacy_products);

 

 

网上很多通过UNION
ALL 实现的办法(如下)是错误的,可能会返回仅在一个表中出现且COUNT(*)
> 1的值:

 

SELECT 
    category, COUNT(*)
FROM
    (SELECT 
        category
    FROM
        new_products UNION ALL SELECT 
        category
    FROM
        legacy_products) a
GROUP BY category
HAVING COUNT(*) > 1;

 

2.5
Except (MS SQL Server) / Minus (Oracle) 差集

 

SELECT category FROM legacy_products
EXCEPT # 在Oracle中为MINUS
SELECT category FROM new_products;

 

NB

MySQL不滋瓷差集,但可以用WHERE…IS
NULL+DISTINCT或WHERE…NOT IN+DISTINCT或WHERE EXISTS实现:

 

SELECT DISTINCT
    category
FROM
    legacy_products
        LEFT JOIN
    new_products USING (category)
WHERE
    new_products.category IS NULL;

SELECT DISTINCT
    category
FROM
    legacy_products
WHERE
    category NOT IN (SELECT 
            category
        FROM
            new_products);

 

 

3.
Conditional Aggregates

 

3.2
NULL

 

use IS NULL or
IS NOT NULL in the WHERE clause to test whether a value is or is not
null.

 

SELECT 
    COUNT(*)
FROM
    flights
WHERE
    arr_time IS NOT NULL
        AND destination = 'ATL';

 

3.3
CASE WHEN “if, then, else”

 

SELECT
    CASE
        WHEN elevation < 250 THEN 'Low'
        WHEN elevation BETWEEN 250 AND 1749 THEN 'Medium'
        WHEN elevation >= 1750 THEN 'High'
        ELSE 'Unknown'
    END AS elevation_tier
    , COUNT(*)
FROM airports
GROUP BY 1;

 

END is required
to terminate the statement, but ELSE is optional. If ELSE is not included, the result
will be NULL.

 

3.4
COUNT(CASE WHEN)

 

count the number
of low elevation airports by state where low elevation is defined as
less than 1000 ft.

SELECT 
    state,
    COUNT(CASE
        WHEN elevation < 1000 THEN 1
        ELSE NULL
    END) AS count_low_elevaton_airports
FROM
    airports
GROUP BY state; 

 

3.5
SUM(CASE WHEN)

 

sum the total
flight distance and compare that to the sum of flight distance from a
particular airline (in this case, Delta) by origin airport. 

SELECT 
    origin,
    SUM(distance) AS total_flight_distance,
    SUM(CASE
        WHEN carrier = 'DL' THEN distance
        ELSE 0
    END) AS total_delta_flight_distance
FROM
    flights
GROUP BY origin; 

 

3.6
Combining aggregates

 

find out the
percentage of flight distance that is from Delta by origin
airport. 

SELECT 
    origin,
    100.0 * (SUM(CASE
        WHEN carrier = 'DL' THEN distance
        ELSE 0
    END) / SUM(distance)) AS percentage_flight_distance_from_delta
FROM
    flights
GROUP BY origin; 

 

3.7
Combining aggregates II

 

Find the
percentage of high elevation airports (elevation >= 2000) by state
from the airports table.

SELECT 
    state,
    100.0 * COUNT(CASE
        WHEN elevation >= 2000 THEN 1
        ELSE NULL
    END) / COUNT(elevation) AS percentage_high_elevation_airports
FROM
    airports
GROUP BY 1;

SELECT 
    state,
    100.0 * SUM(CASE
        WHEN elevation >= 2000 THEN 1
        ELSE 0
    END) / COUNT(elevation) AS percentage_high_elevation_airports
FROM
    airports
GROUP BY 1;

 

 

4.
Date, Number and String Functions

 

MySQL Date
函数:

 

NOW()  
 返回当前的日期和时间

CURDATE()  
 返回当前的日期

CURTIME()  
 返回当前的时间

DATE()  
 提取日期或日期/时间表达式的日期部分

EXTRACT()  
 返回日期/时间的单独部分,比如年、月、日、小时、分钟等等

DATE_ADD()  
 给日期添加指定的时间间隔

DATE_SUB()  
 从日期减去指定的时间间隔

DATEDIFF()  
 返回两个日期之间的天数

DATE_FORMAT()  
 用不同的格式显示日期/时间

 

例1:

 

SELECT NOW(),
CURDATE(), CURTIME();

结果:

NOW()          
                 CURDATE()    CURTIME()

2008-12-29
16:25:46      2008-12-29       16:25:46

 

例2:

 

CREATE TABLE Orders (
    OrderId INT NOT NULL,
    ProductName VARCHAR(50) NOT NULL,
    OrderDate DATETIME NOT NULL DEFAULT NOW (),
    PRIMARY KEY (OrderId)
);

 

OrderDate 列规定
NOW()
作为默认值。作为结果,当您向表中插入行时,当前日期和时间自动插入列中。

 

例3:

 

EXTRACT (unit
FROM date): date参数是合法的日期表达式,unit参数可以是下列值:

DATE_ADD(date,INTERVAL
expr type):
date参数是合法的日期表达式,expr是希望添加的时间间隔,type参数可以是下列值:

DATE_SUB(date,INTERVAL
expr type):
date参数是合法的日期表达式,expr是希望添加的时间间隔,type参数可以是下列值:

 

MICROSECOND

SECOND

MINUTE

HOUR

DAY

WEEK

MONTH

QUARTER

YEAR

SECOND_MICROSECOND

MINUTE_MICROSECOND

MINUTE_SECOND

HOUR_MICROSECOND

HOUR_SECOND

HOUR_MINUTE

DAY_MICROSECOND

DAY_SECOND

DAY_MINUTE

DAY_HOUR

 

例4:

 

DATEDIFF(date1,date2):
date1 和 date2
参数是合法的日期或日期/时间表达式。只有值的日期部分参与计算。

 

例5:

 

DATE_FORMAT(date,format):
date 参数是合法的日期。format 规定日期/时间的输出格式。

可以使用的格式有:

 

SELECT 
    id,
    carrier,
    origin,
    destination,
    DATE_FORMAT(NOW(), '%Y-%c-%d %T') AS datetime
FROM
    flights;

 

4.2
Dates

 

select the date
and time of all deliveries in the baked_goods table using the column
delivery_time.

SELECT 
    DATE(delivery_time), TIME(delivery_time)
FROM
    baked_goods;

 

4.4
Dates III

 

Each of the
baked goods is packaged five hours, twenty minutes, and two days after
the delivery. Create a query returning all the packaging times for the
goods in the baked_goods table.

SELECT 
    DATE_ADD(delivery_time,
        INTERVAL '2 5:20:00' DAY_SECOND) AS package_time
FROM
    baked_goods; 

 

DATE:

 

4.6
Numbers II

 

GREATEST(n1,n2,n3,…):
returns the greatest value in the set of the input numeric
expressions;

LEAST(n1,n2,n3,…):
returns the least value in the set of the input numeric
expressions;

 

Find the
greatest time value for each item.

SELECT 
    id, GREATEST(cook_time, cool_down_time)
FROM
    baked_goods;

 

NB:不同数据类型的比较规则

 

上述命令的结果集为:

图片 1

 

而baked_goods中的cook_time和cool_down_time实际为:

图片 2

 

显然row
2,13和15中的33>5,20>8,45>8,但GREATEST返回的是较小的值

这是因为cook_time和cool_down_time这两列的数据类型是TEXT:

图片 3

 

在GREATEST和LEAST命令中,

当数据类型为TEXT等文本类时,比较的是字符串的大小,即从字符串的首个字符开始比较。’5’比>3’,所以’5′>’33’。’T’>’R’,所以’TORRES’>’RENE’;

当数据类型为INT、BIGINT等数字类时,比较的才是数值的大小。

 

 

4.7
Strings

 

CONCAT(concatenate):

 

Combine the
first_name and last_name columns from the bakeries table as the
full_name.

SELECT 
    CONCAT(first_name, ' ', last_name) AS full_name
FROM
    bakeries;

 

GROUP_CONCAT:

 

Combine the
cities of the three states in city column from bakeries table as
cities.

SELECT 
    state, GROUP_CONCAT(DISTINCT (city)) AS cities
FROM
    bakeries
WHERE
    state IN ('California' , 'New York', 'Texas')
GROUP BY state;

 

 

NB

CONCAT返回结果为连接参数产生的字符串。

如有任何一个参数为NULL
,则返回值为 NULL;

如果所有参数均为非二进制字符串,则结果为非二进制字符串;

如果自变量中含有任一二进制字符串,则结果为一个二进制字符串;

一个数字参数被转化为与之相等的二进制字符串格式;可使用显示类型CAST避免这种情况,例如:

SELECT
CONCAT(CAST(int_col AS CHAR), char_col)

 

CAST:

 

例1:

 

SELECT CAST(12
AS DECIMAL) / 3;

 

Returns the
result as a DECIMAL by casting one of the values as a decimal rather
than an integer.

 

例2:

 

SELECT 
    CONCAT(CAST(distance AS CHAR), ' ', city)
FROM
    bakeries;

 

4.8
Strings II

 

REPLACE(string,from_string,to_string);

The function
returns the string ‘string’ with all occurrences of the string
‘from_string’ replaced by the string ‘to_string’.

 

Replace
‘enriched_flour’ in the ingredients list to just ‘flour’.

SELECT 
    id,
    REPLACE(ingredients,
        'enriched_flour',
        'flour')
FROM
    baked_goods;

 

 

 

III.
SQL: Analyzing Business Metrics

 

 

1.
Advanced Aggregates

 

1.4
Daily Revenue

 

how much we’re
making per day for kale-smoothies.

SELECT 
    DATE(ordered_at), ROUND(SUM(amount_paid), 2)
FROM
    orders
        JOIN
    order_items ON orders.id = order_items.order_id
WHERE
    name = 'kale-smoothie'
GROUP BY 1
ORDER BY 1;

 

1.6
Meal Sums

 

total revenue of
each item.

SELECT 
    name, ROUND(SUM(amount_paid), 2)
FROM
    order_items
GROUP BY name
ORDER BY 2 DESC;

 

1.7
Product Sum 2

 

percent of
revenue each product represents.

SELECT 
    name,
    ROUND(SUM(amount_paid) / (SELECT 
                    SUM(amount_paid)
                FROM
                    order_items) * 100.0,
            2) AS PCT
FROM
    order_items
GROUP BY 1
ORDER BY 2 DESC;

 

Subqueries can
be used to perform complicated calculations and create filtered or
aggregate tables on the fly.

 

1.9
Grouping with Case Statements

 

group the order
items by what type of food they are.

SELECT 
    *,
    CASE name
        WHEN 'kale-smoothie' THEN 'smoothie'
        WHEN 'banana-smoothie' THEN 'smoothie'
        WHEN 'orange-juice' THEN 'drink'
        WHEN 'soda' THEN 'drink'
        WHEN 'blt' THEN 'sandwich'
        WHEN 'grilled-cheese' THEN 'sandwich'
        WHEN 'tikka-masala' THEN 'dinner'
        WHEN 'chicken-parm' THEN 'dinner'
        ELSE 'other'
    END AS category
FROM
    order_items
ORDER BY id;

 

look at percents
of purchase by category:

SELECT 
    CASE name
        WHEN 'kale-smoothie' THEN 'smoothie'
        WHEN 'banana-smoothie' THEN 'smoothie'
        WHEN 'orange-juice' THEN 'drink'
        WHEN 'soda' THEN 'drink'
        WHEN 'blt' THEN 'sandwich'
        WHEN 'grilled-cheese' THEN 'sandwich'
        WHEN 'tikka-masala' THEN 'dinner'
        WHEN 'chicken-parm' THEN 'dinner'
        ELSE 'other'
    END AS category,
    ROUND(1.0 * SUM(amount_paid) / (SELECT 
                    SUM(amount_paid)
                FROM
                    order_items) * 100,
            2) AS PCT
FROM
    order_items
GROUP BY 1
ORDER BY 2 DESC;

 

NB

Here 1.0 * is
a shortcut to ensure the database represents the percent as a
decimal.

 

1.11
Recorder Rates

 

We’ll define
reorder rate as the ratio of the total number of orders to the number of
people making those orders. A lower ratio means most of the orders are
reorders. A higher ratio means more of the orders are first
purchases.

 

SELECT 
    name,
    ROUND(1.0 * COUNT(DISTINCT order_id) / COUNT(DISTINCT delivered_to),
            2) AS reorder_rate
FROM
    order_items
        JOIN
    orders ON orders.id = order_items.order_id
GROUP BY 1
ORDER BY 2 DESC;

 

 

2.
Common Metrics

 

2.2
Daily Revenue

 

SELECT 
    DATE(created_at), ROUND(SUM(price), 2)
FROM
    purchases
GROUP BY 1
ORDER BY 1;

 

2.3
Daily Revenue 2

 

Update our daily
revenue query to exclude refunds.

SELECT 
    DATE(created_at), ROUND(SUM(price), 2) AS daily_rev
FROM
    purchases
WHERE
    refunded_at IS NULL
GROUP BY 1
ORDER BY 1;

 

这里Codecademy的Instructions和代码识别都要求“WHERE
refunded_at IS NOT NULL”,应该是写错了。

 

2.4
Daily Active Users

 

Calculate
DAU

SELECT 
    DATE(created_at), COUNT(DISTINCT user_id) AS DAU
FROM
    gameplays
GROUP BY 1
ORDER BY 1;

 

2.5
Daily Active Users 2

 

Calculate DAU
per-platform

SELECT 
    DATE(created_at), platform, COUNT(DISTINCT user_id) AS DAU
FROM
    gameplays
GROUP BY 1 , 2
ORDER BY 1 , 2;

 

2.6
Daily Average Revenue Per Paying User(ARPPU)

 

Calculate Daily
ARPPU

SELECT 
    DATE(created_at),
    ROUND(SUM(price) / COUNT(DISTINCT user_id), 2) AS ARPPU
FROM
    purchases
WHERE
    refunded_at IS NULL
GROUP BY 1
ORDER BY 1;

 

2.8
ARPU 2

 

One way to
create and organize temporary results in a query is with CTEs, Common
Table Expressions, aka WITH … AS clauses. The WITH … AS clauses make
it easy to define and use results in a more organized way than
subqueries.

 

NB

MySQL不滋瓷CTE。可能可以用临时表实现,待验证。(?)

 

Calculate Daily
Revenue

WITH daily_revenue AS 
    (
    SELECT 
        date(created_at) AS dt,
        ROUND(SUM(price), 2) AS rev
    FROM
        purchases
    WHERE refunded_at IS NULL
    GROUP BY 1
    )
SELECT 
    * 
FROM 
    daily_revenue 
ORDER BY dt;

 

2.9
ARPU 3

 

Calculate Daily
ARPU

WITH daily_revenue AS (
  SELECT
    DATE(created_at) AS dt,
    ROUND(SUM(price), 2) AS rev
  FROM purchases
  WHERE refunded_at IS NULL
  GROUP BY 1
), 
daily_players AS (
  SELECT
    DATE(created_at) AS dt,
    COUNT(DISTINCT user_id) AS players
  FROM gameplays
  GROUP BY 1
)
SELECT
  daily_revenue.dt,
  daily_revenue.rev / daily_players.players
FROM daily_revenue
  JOIN daily_players USING (dt);

 

2.12
1 Day Retention 2

 

SELF
JOIN:

By using a
self-join, we can make multiple gameplays available on the same row of
results. This will enable us to calculate retention.

The power of
self-join comes from joining every row to every other row. This makes it
possible to compare values from two different rows in the new result
set.

 

SELECT 
    DATE(g1.created_at) AS dt, g1.user_id
FROM
    gameplays AS g1
        JOIN
    gameplays AS g2 ON g1.user_id = g2.user_id
ORDER BY 1
LIMIT 100;

 

2.13
1 Day Retention 3

 

Calculate 1 Day
Retention Rate

SELECT 
    DATE(g1.created_at) AS dt,
    ROUND(100 * COUNT(DISTINCT g2.user_id) / COUNT(DISTINCT g1.user_id)) AS retention
FROM
    gameplays AS g1
        LEFT JOIN
    gameplays AS g2 ON g1.user_id = g2.user_id
        AND DATE(g1.created_at) = DATE(DATE_SUB(g2.created_at, INTERVAL 1 DAY))
GROUP BY 1
ORDER BY 1;

 

NB

游戏行业中,公认的次日留存率(1
Day Retention
Rate)定义为:DNU在次日再次登录的比例。而本题计算的是:DAU在次日再次登录的比例。

 

 

 

IV.
附录

 

 

1.
DISTINCT和GROUP BY的去重逻辑浅析

 

SELECT 
    COUNT(DISTINCT amount_paid)
FROM
    order_items;

SELECT 
    COUNT(1)
FROM
    (SELECT 
        1
    FROM
        order_items
    GROUP BY amount_paid) a;

SELECT 
    SUM(1)
FROM
    (SELECT 
        1
    FROM
        order_items
    GROUP BY amount_paid) a;

 

分别是在运算和存储上的权衡:

DISTINCT需要将列中的全部内容存储在一个内存中,将所有不同值存起来,内存消耗可能较大;

GROUP
BY先将列排序,排序的基本理论是:时间复杂为nlogn,空间为1。优点是空间复杂度小,(?)缺点是执行时间会较长。

 

使用时根据具体情况取舍:数据分布离散时,使用GROUP
BY;数据分布集中时,使用DISTINCT,效率高,空间占用较小。

 

 

2.
SELECT 1 FROM table

 

1是一常量(可以为任意数值),查到的所有行的值都是它,但从效率上来说,1>column
name>*,因为不用查字典表(?)。

没有特殊含义,只要有数据就返回1,没有则返回NULL。

常用于EXISTS、子查询中,一般在判断子查询是否成功(即是否有满足条件)时使用,如:

SELECT 
    *
FROM
    orders
WHERE
    EXISTS( SELECT 
            1
        FROM
            orders o
                JOIN
            order_items i ON o.id = i.order_id);

 

 

3.
WHERE 1=1和WHERE 1=0的作用

 

3.1
WHERE 1=1:条件恒真,同理WHERE ‘a’ =
‘a’等。在构造动态SQL语句,如:不定数量查询条件时,1=1可以很方便地规范语句:

 

3.1.1
制作查询页面时,若可查询的选项有多个,且用户可自行选择并输入关键词,那么按照平时的查询语句的动态构造,代码大致如下:

MySqlStr="SELECT * FROM table WHERE";

  IF(Age.Text.Lenght>0)
  {
    MySqlStr=MySqlStr+"Age="+"'Age.Text'";
  }

  IF(Address.Text.Lenght>0)
  {
    MySqlStr=MySqlStr+"AND Address="+"'Address.Text'";
  }

 

1)如果上述的两个IF判断语句,均为True,即用户都输入了查询词,那么,最终的MySqlStr动态构造语句变为:

MySqlStr=”SELECT
* FROM table WHERE Age=’27’ AND
Address=’广东省深圳市南山区科兴科学园'”语句完整,能够被正确执行;

2)如果两个IF都不成立,MySqlStr=”SELECT
* FROM table WHERE”;,语句错误,无法执行。

 

3.1.2 使用WHERE
1=1:

 

1)MySqlStr=”SELECT *
FROM table WHERE 1=1 AND Age=’27’ AND
Address=’广东省深圳市南山区科兴科学园'”;,正确可执行;

2)MySqlStr=”SELECT *
FROM table WHERE 1=1″;,由于WHERE
1=1恒真,该语句能够被正确执行,作用相当于:MySqlStr=”SELECT * FROM
table”;

 

也就是说:如果用户在多条件查询页面中,不选择任何字段、不输入任何关键词,那么,必将返回表中所有数据;如果用户在页面中,选择了部分字段并且输入了部分查询关键词,那么,就按用户设置的条件进行查询。

 

WHERE
1=1仅仅只是为了满足多条件查询页面中不确定的各种因素而采用的一种构造一条正确能运行的动态SQL语句的一种方法。

 

3.2
WHERE 1=0:条件恒假,同理WHERE 1 <>
1等。不会返回任何数据,只有表结构,可用于快速建表:

 

3.2.1
用于读取表的结构而不考虑表中的数据,这样节省了内存,因为可以不用保存结果集:

SELECT 
    *
FROM
    table
WHERE
    1 = 0; 

 

3.2.2
创建一个新表,新表的结构与查询的表的结构相同:

CREATE TABLE newtable AS SELECT * FROM
    oldtable
WHERE
    1 = 0;  

 

 

 

4.
复制表

 

CREATE TABLE newtable AS SELECT * FROM
    oldtable;  

 

5.
Having

 

NB

在 SQL 中增加
HAVING 子句原因是,WHERE
关键字无法与聚合函数(aggregate
function)一起使用。常用的聚合函数有:COUNT,SUM,AVG,MAX,MIN等。

 

例1:查找订单总金额少于
2000 的客户

 

SELECT 
    Customer, SUM(OrderPrice)
FROM
    Orders
GROUP BY Customer
HAVING SUM(OrderPrice) < 2000;

 

例2:查找客户
“Bush” 或 “Adams” 拥有超过 1500 的订单总金额

 

SELECT 
    Customer, SUM(OrderPrice)
FROM
    Orders
WHERE
    Customer = 'Bush' OR Customer = 'Adams'
GROUP BY Customer
HAVING SUM(OrderPrice) > 1500;

 

 

 

6.
USING

 

It is mostly
syntactic sugar, but a couple differences are noteworthy:

 

ON is the more
general of the two. One can JOIN tables ON a column, a set of columns
and even a condition. For example:

SELECT 
    *
FROM
    world.City
        JOIN
    world.Country ON (City.CountryCode = Country.Code) 
WHERE ...

 

USING is useful
when both tables share a column of the exact same name on which they
join. In this case, one may say:

SELECT 
    film.title, film_id # film_id is not prefixed
FROM
    film
        JOIN
    film_actor USING (film_id)
WHERE ...

 

To do the above
with ON, we would have to write:

SELECT 
    film.title, film.film_id # film.film_id is required here
FROM
    film
        JOIN
    film_actor ON (film.film_id = film_actor.film_id)
WHERE ...

 

NB film.film_id
qualification in the SELECT clause. It would be invalid to just say
film_id
 since that would make for an ambiguity.

 

 

 

7.
IFNULL()和COALESCE()函数 用于规定如何处理NULL值

 

假如
“UnitsOnOrder” 列可选,且包含 NULL 值。使用:

SELECT
    ProductName, UnitPrice * (UnitsInStock + UnitsOnOrder)
FROM
    Products;

 

由于
“UnitsOnOrder” 列存在NULL值,那么结果是 NULL。

为了便于计算,我们希望如果值是NULL,则返回0:

SELECT
    ProductName, UnitPrice * (UnitsInStock + IFNULL(UnitsOnOrder, 0))
FROM
    Products;

SELECT
    ProductName, UnitPrice * (UnitsInStock + COALESCE(UnitsOnOrder, 0))
FROM
    Products;

 

 

8.
UCASE()和LCASE()函数 字段值得大小写转换

 

SELECT 
    LCASE(first_name) AS first_name,
    last_name,
    city,
    UCASE(state) AS state
FROM
    bakeries;

 

9.
MID()函数 用于从文本字段中提取字符

 

SELECT 
    MID(column_name, start, length)
FROM
    table_name;

 

column_name:
必需。要提取字符的字段。

start:
必需。规定开始位置(起始值是 1)。

length:
可选。要返回的字符数。如果省略,则 MID() 函数返回剩余文本。

 

SELECT 
    MID(state, 1, 3) AS smallstate
FROM
    bakeries;

 

10.
LENGTH()函数 返回文本字段中 值的长度

 

SELECT 
    LENGTH(City) AS LengthOfCity
FROM
    bakeries;

 

11.
WHERE EXISTS命令

 

11.1
WHERE EXISTS与WHERE NOT EXISTS

 

EXISTS:判断子查询得到的结果集是否是一个空集,如果不是,则返回
True,如果是,则返回
False。即:如果在当前的表中存在符合条件的这样一条记录,那么返回
True,否则返回 False。

 

NOT
EXISTS:作用与 EXISTS 正相反,当子查询的结果为空集时,返回
True,反之返回 False。也就是所谓的”若不存在“。

 

EXISTS和NOT
EXISTS所在的查询属于相关子查询,即:对于外层父查询中的每一行,都执行一次子查询。先取父查询的第一个元组,根据它与子查询相关的属性处理子查询,若子查询WHERE条件表达式成立,则表达式返回True,则将此元组放入结果集,以此类推,直到遍历父查询表中的所有元组。

 

 

查询选修了任意课程的学生的姓名:

SELECT 
    Sname
FROM
    student
WHERE
    EXISTS( SELECT 
            *
        FROM
            sc,
            course
        WHERE
            sc.Sno = student.Sno
                AND sc.Cno = course.Cno);

 

查询未被200215123号学生选修的课程的课名:

SELECT 
    Cname
FROM
    course
WHERE
    NOT EXISTS( SELECT 
            *
        FROM
            sc
        WHERE
            Sno = 200215123
                AND Cno = course.Cno);

 

11.2
WHERE EXISTS与WHERE NOT EXISTS的双层嵌套

 

1)查询选修了全部课程的学生的课程:

SELECT 
    Sname
FROM
    student
WHERE
    NOT EXISTS( SELECT 
            *
        FROM
            course
        WHERE
            NOT EXISTS( SELECT 
                    *
                FROM
                    sc
                WHERE
                    Sno = student.Sno AND Cno = course.Cno));

 

思路:

STEP1:先取
Student 表中的第一个元组,得到其 Sno 列的值。

STEP2:再取
Course 表中的第一个元组,得到其 Cno 列的值。

STEP3:根据 Sno
与 Cno 的值,遍历 SC 表中的所有记录(也就是选课记录)。若对于某个 Sno 和
Cno 的值来说,在 SC 表中找不到相应的记录,则说明该 Sno
对应的学生没有选修该 Cno 对应的课程。

STEP4:对于某个学生来说,若在遍历
Course
表中所有记录(也就是所有课程)后,仍找不到任何一门他/她没有选修的课程,就说明此学生选修了全部的课程。

STEP5:将此学生放入结果元组集合中。

STEP6:回到
STEP1,取 Student 中的下一个元组。

STEP7:将所有结果元组集合显示。

其中第一个 NOT
EXISTS 对应 STEP4,第二个 NOT EXISTS 对应 STEP3。

 

2)查询被所有学生选修的课程的课名:

SELECT 
    Cname
FROM
    course
WHERE
    NOT EXISTS( SELECT 
            *
        FROM
            student
        WHERE
            NOT EXISTS( SELECT 
                    *
                FROM
                    sc
                WHERE
                    Cno = course.Cno AND Sno = student.Sno));

 

3)查询选修了200215123号学生选修的全部课程的学生的学号:

SELECT 
    DISTINCT Sno
FROM
    sc scx
WHERE
    NOT EXISTS( SELECT 
            *
        FROM
            sc scy
        WHERE scy.Sno = 200215123
           AND NOT EXISTS( SELECT 
                    *
                FROM
                    sc
                WHERE
                    Sno = scx.Sno AND Cno = scy.Cno));

 

 

扩展:

图片 4

 

  1. Condition
    1:TA会C2。Condition 2:TA选修了课程 =>
    exists+exists:查询选修了任意课程的学生;

  2. Condition
    1:TA不会C2。Condition 2:TA选修了课程 => not
    exists+exists:查询未选修任何课程的学生;

  3. Condition
    1:TA不会C2。Condition 2:TA有课程课没选 => not exists+not
    exists:查询选修了所有课程的学生;

  4. Condition
    1:TA会C2。Condition 2:TA有课程没选 => exists+not
    exists:查询未选修所有课程的学生;

 

11.3
WHERE EXISTS与WHERE IN的比较与使用

 

SELECT 
    Sname
FROM
    student
WHERE
    EXISTS( SELECT 
            *
        FROM
            sc,
            course
        WHERE
            sc.Sno = student.Sno
                AND sc.Cno = course.Cno
                AND course.Cname = '操作系统');

 

SELECT 
    Sname
FROM
    student
WHERE
    Sno IN (SELECT 
            Sno
        FROM
            sc,
            course
        WHERE
            sc.Sno = student.Sno
                AND sc.Cno = course.Cno
                AND course.Cname = '操作系统');

 

以上两个查询都返回选修了“操作系统”课程的学生的姓名,但原理不同:

EXISTS:对外表做loop循环,每次loop循环再对内表进行查询。首先检查父查询,然后运行子查询,直到它找到第一个匹配项。

IN:把内表和外表做hash连接。首先执行子查询,并将获得的结果存放在一个加了索引的临时表中。在执行子查询前,系统先将父查询挂起,待子查询执行完毕,存放在临时表中时以后再执行父查询。

 

因此:

1)如果两个表中一个较小,一个是较大,则子查询表大的用EXISTS,子查询表小的用IN

2)在查询的两个表大小相当时,3种查询方式的执行时间通常是:

EXISTS <=
IN <= JOIN

NOT EXISTS
<= NOT IN <= LEFT JOIN

只有当表中字段允许NULL时,NOT
IN最慢:

NOT EXISTS <=
LEFT JOIN <= NOT IN

3)无论哪个表大,NOT
EXISTS都比NOT IN快。
因为NOT
IN会对内外表都进行全表扫描,没有用到索引;而NOT
EXISTS的子查询依然能用到表上的索引。

 

 

 

12.
特殊字符单引号的处理

 

转义符号:代表下一个字符为文本字符串的一部分,而非字符串的开始、结束或其他指令

 

用反斜线转义引号:(在查询出错时可以轻易发现多余单引号的位置)

SELECT 
    *
FROM
    my_contacts
WHERE
    location = 'Grover's Mill, NJ';

 

用另一个单引号转义引号:

SELECT 
    *
FROM
    my_contacts
WHERE
    location = 'Grover's Mill, NJ';

 

 

13.
REGEXP 正则表达式匹配查询

 

^:匹配输入字符串的开始位置

$:匹配输入字符串的结束位置

.:匹配除”n”之外的任何单个字符

[…]:字符集合。'[abc]’可以匹配’plain’中的’a’

[^…]:负值字符集合

p1|p2|p3:匹配p1或p2或p3。’z|food’匹配’z’或’food’,'(z|f)ood匹配’zood’或’food’

*:匹配0或多次。’zo*’能匹配’z’或’zoo’,*等价于{0,}

+:匹配1或多次。’zo+’能匹配’zo’或’zoo’,+等价于{1,}

?:匹配0或1次。’zo’能匹配’z’或’zo’,?等价于{0,1}

{n}:n为非负整数,代表匹配n次

{n,m}:m,
n均为非负整数,且n<=m,代表最少匹配n次,最多匹配m次

‘[charlist]’
any individual character in string: WHERE city REGEXP ‘[ALN]’
以“A”或”L“或”N“开头的城市

‘[^charlist]’
any individual character not in string: WHERE city REGEXP ‘[^ALN]’
不以“A”或”L“或”N“开头的城市

 

查找word字段中以acg字符开头或以’vity’字符串结尾的所有数据:

SELECT 
    word
FROM
    dictionary
WHERE
    word REGEXP '^[acg]|vity$'; 

 

NB

正则表达式无法准确匹配'[…]’中的汉字,不知道是因为MySQL编译时以ISO-8859作为字符集,还是因为数据库字符未设置为GBK。

 

 

14.
以时间段为单位取数和FLOOR()函数(向下取整)

 

上周五(Feb 24,
2017)遇到的面试题:以5分钟为单位取数。

有一个如下的表,以5分钟为单位计算有多少个不重复user_id生成。

+————-+———-+——+—–+———+——-+

| Field | Type |
Null | Key | Default | Extra |

+————-+———-+——+—–+———+——-+

| id | int(11) |
NO | PRI | NULL | |

| user_id |
int(11) | YES | | NULL | |

| price | double
| YES | | NULL | |

| refunded_at |
text | YES | | NULL | |

| created_at |
datetime | YES | | NULL | |

+————-+———-+——+—–+———+——-+ 

SELECT 
    STR_TO_DATE(DATE_FORMAT(FLOOR(created_at / 500) * 500,
                    '%Y-%c-%d %T'),
            '%Y-%c-%d %T') AS every_five_minutes,
    COUNT(DISTINCT user_id) AS count_user_id
FROM
    purchases
GROUP BY 1
ORDER BY 1;

思路:

1)DATETIME类型的字段created_at在除以500,再乘以500后会变为数值,2015年8月4日
03:47:44变为20150804034744.0000。因此在created_at/500时,使用FLOOR函数向下取整以抹平余数,再乘以500就能够得到以500为间隔的数字;

2)使用DATE_FORMAT将数字转化为日期;

3)由于DATE_FORMAT转化的日期为文本字符串,不便于排序。因此,再使用STR_TO_DATE转化为日期格式。

 

 

15.
转置表的行和列

 

同样是上周五的面试题:转置任意一个表的行和列。

假设有一个如下的course表:

+—–+————–+——+———+

| Cno | Cname |
Cpno | Ccredit |

+—–+————–+——+———+

| 1 | 数据库 | 5
| 4 |

| 2 | 数学 |
NULL | 2 |

| 3 | 信息系统 |
1 | 4 |

| 4 | 操作系统 |
6 | 3 |

| 5 | 数据结构 |
7 | 4 |

| 6 | 数据处理 |
NULL | 2 |

| 7 | PASCAL语言
| 6 | 4 |

+—–+————–+——+———+

要使其变为:

图片 5

可以使用SUM(IF())或CASE
WHEN函数。

SELECT 
    c2 AS '课程',
    SUM(IF(c1 = '数据库', c3, NULL)) AS '数据库',
    SUM(IF(c1 = '数学', c3, NULL)) AS '数学',
    SUM(IF(c1 = '信息系统', c3, NULL)) AS '信息系统',
    SUM(IF(c1 = '操作系统', c3, NULL)) AS '操作系统',
    SUM(IF(c1 = '数据结构', c3, NULL)) AS '数据结构',
    SUM(IF(c1 = '数据处理', c3, NULL)) AS '数据处理',
    SUM(IF(c1 = 'PASCAL语言', c3, NULL)) AS 'PASCAL语言'
FROM
    (SELECT 
        Cname AS c1, 'Cpno' AS c2, Cpno AS c3
    FROM
        course ca UNION ALL SELECT 
        Cname, 'Ccredit' AS c2, Ccredit AS c3
    FROM
        course cb) AS row_column_convert
GROUP BY 1
ORDER BY c2 DESC;

SELECT 
    c2 AS '课程',
    SUM(CASE
        WHEN c1 = '数据库' THEN c3
        ELSE NULL
    END) AS '数据库',
    SUM(CASE
        WHEN c1 = '数学' THEN c3
        ELSE NULL
    END) AS '数学',
    SUM(CASE
        WHEN c1 = '信息系统' THEN c3
        ELSE NULL
    END) AS '信息系统',
    SUM(CASE
        WHEN c1 = '操作系统' THEN c3
        ELSE NULL
    END) AS '操作系统',
    SUM(CASE
        WHEN c1 = '数据结构' THEN c3
        ELSE NULL
    END) AS '数据结构',
    SUM(CASE
        WHEN c1 = '数据处理' THEN c3
        ELSE NULL
    END) AS '数据处理',
    SUM(CASE
        WHEN c1 = 'PASCAL语言' THEN c3
        ELSE NULL
    END) AS 'PASCAL语言'
FROM
    (SELECT 
        Cname AS c1, 'Cpno' AS c2, Cpno AS c3
    FROM
        course ca UNION ALL SELECT 
        Cname, 'Ccredit' AS c2, Ccredit AS c3
    FROM
        course cb) AS row_column_convert
GROUP BY 1
ORDER BY c2 DESC;

 

 

 

 

 

 

 

The trouble is,
you think you have time.

Leave a Comment.