[MySQL/HackerRank]Aggregation
문제 링크: https://www.hackerrank.com/domains/sql?filters%5Bsubdomains%5D%5B%5D=aggregation
Solve SQL Code Challenges
A special-purpose language designed for managing data held in a relational database.
www.hackerrank.com
Aggregation까지 다 풀고 나니 실버 4가 됐다. Median을 구하는 'Weather Observation Station 20' 문제 빼고는 평이한 난이도였다.
-- Revising Aggregations - The Count Function
SELECT COUNT(ID) FROM CITY WHERE POPULATION > 100000
-- Revising Aggregations - The Sum Function
SELECT SUM(POPULATION) FROM CITY WHERE DISTRICT = 'California'
-- Revising Aggregations - Averages
SELECT AVG(POPULATION) FROM CITY WHERE DISTRICT = 'California'
-- Average Population
-- 마침 round 해야하는 문제가 나왔다
SELECT ROUND(AVG(POPULATION)) FROM CITY
-- Japan Population
SELECT SUM(POPULATION) FROM CITY WHERE COUNTRYCODE = 'JPN'
-- Population Density Difference
SELECT MAX(POPULATION)-MIN(POPULATION) FROM CITY
-- The Blunder
SELECT CEIL(AVG(SALARY)-AVG(REPLACE(SALARY, '0', ''))) FROM EMPLOYEES
1. 반올림 round(숫자, 자릿수)
2. 제거(버림) truncate(숫자, 자릿수)
3. 올림 ceil(숫자, 자릿수)
4. column 값 수정 replace(column_name, 바꿀 값, 바뀔 값)
-- Top Earners
-- 어려울 이유는 없는데 헷갈린 문제
-- GROUP BY
SELECT SALARY*MONTHS AS EARNINGS, COUNT(*) FROM EMPLOYEE GROUP BY EARNINGS ORDER BY EARNINGS DESC LIMIT 1
이 문제를 풀고나서 Discussion을 봤더니 가장 많은 추천을 받은 답이 'GROUP BY 1'을 사용했었다. 왜 GROUP BY에 속성명을 사용하지 않고, 숫자를 사용했는지 궁금해서 확인해봤다.
Why do we use Group by 1 and Group by 1,2,3 in SQL query?
In SQL queries, we do use Group by clause to apply aggregate functions. But what is the purpose behind using numeric value instead of column name with Group by clause? For example: Group by 1.
dba.stackexchange.com
나쁜 쿼리 작성 방식이고, 인수인계는 물론이고 데이터베이스 구조가 바뀔 때에도 좋지 않다가 결론이었다.
-- Weather Observation Station 2
SELECT ROUND(SUM(LAT_N), 2), ROUND(SUM(LONG_W), 2) FROM STATION
-- Weather Observation Station 13
SELECT TRUNCATE(SUM(LAT_N), 4) FROM STATION WHERE LAT_N > 38.7880 AND LAT_N < 137.2345
-- Weather Observation Station 14
SELECT TRUNCATE(LAT_N, 4) FROM STATION WHERE LAT_N < 137.2345 ORDER BY LAT_N DESC LIMIT 1
-- Weather Observation Station 15
SELECT ROUND(LONG_W, 4) FROM STATION WHERE LAT_N < 137.2345 ORDER BY LAT_N DESC LIMIT 1
-- Weather Observation Station 16
SELECT ROUND(LAT_N, 4) FROM STATION WHERE LAT_N > 38.7780 ORDER BY LAT_N ASC LIMIT 1
-- Weather Observation Station 17
SELECT ROUND(LONG_W, 4) FROM STATION WHERE LAT_N > 38.7780 ORDER BY LAT_N ASC LIMIT 1
-- Weather Observation Station 18
-- 맨하탄 거리
SELECT ROUND((MAX(LAT_N)-MIN(LAT_N))+(MAX(LONG_W)-MIN(LONG_W)), 4) FROM STATION
-- Weather Observation Station 19
-- 유클리드 거리
SELECT TRUNCATE(
POWER(POWER(MAX(LAT_N) - MIN(LAT_N), 2) + POWER(MAX(LONG_W) - MIN(LONG_W), 2), 0.5)
, 4)
FROM STATION
맨해튼 거리는 원래 ABS()를 사용해야 하지만, LAT_N의 모든 값이 양수인 상황에서 MAX(LAT_N)-MIN(LAT_N)이 음수가 될 수 없으므로 사용하지 않았다.