SQL Server: select distinct data with transpose

I have data in a table like this:
+-----------+-------------+-----------------+------+----------+
| make | model | variant | year | price |
+-----------+-------------+-----------------+------+----------+
| CHEVROLET | SPARK | 08LS | 2018 | 1000000 |
| CHEVROLET | SPARK | 08LS | 2017 | 2000000 |
| CHEVROLET | SPARK | 08LS | 2016 | 3000000 |
| CHEVROLET | SPARK | 08LSTRENDY | 2018 | 4000000 |
| CHEVROLET | SPARK | 08LSTRENDY | 2017 | 5000000 |
| CHEVROLET | SPARK | 08LSTRENDY | 2016 | 6000000 |
| TOYOTA | LANDCRUISER | 10042DVX | 2018 | 7000000 |
| TOYOTA | LANDCRUISER | 10042DVX | 2017 | 8000000 |
| TOYOTA | LANDCRUISER | 10042DVX | 2016 | 9000000 |
| TOYOTA | LANDCRUISER | 10042DVX | 2015 | 10000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2018 | 11000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2017 | 12000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2016 | 13000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2015 | 14000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2014 | 15000000 |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 2013 | 16000000 |
+-----------+-------------+-----------------+------+----------+
I want to select the data in the form below:
+-----------+-------------+-----------------+----------+----------+----------+----------+----------+----------+
| make | model | variant | 2018 | 2017 | 2016 | 2015 | 2014 | 2013 |
+-----------+-------------+-----------------+----------+----------+----------+----------+----------+----------+
| CHEVROLET | SPARK | 08LS | 1000000 | 2000000 | 3000000 | NULL | NULL | NULL |
| CHEVROLET | SPARK | 08LSTRENDY | 4000000 | 5000000 | 6000000 | NULL | NULL | NULL |
| TOYOTA | LANDCRUISER | 10042DVX | 7000000 | 8000000 | 9000000 | 10000000 | NULL | NULL |
| TOYOTA | LANDCRUISER | 10042DVXLIMITED | 11000000 | 12000000 | 13000000 | 14000000 | 15000000 | 16000000 |
+-----------+-------------+-----------------+----------+----------+----------+----------+----------+----------+
Could you please help me to write a query for this?

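You can do conditional aggregation:
-- your_table is a placeholder for the real table name.
-- No ELSE branch: a year with no row stays NULL, as in the desired output.
select make, model, variant,
       sum(case when [year] = 2018 then price end) as [2018],
       sum(case when [year] = 2017 then price end) as [2017],
       sum(case when [year] = 2016 then price end) as [2016],
       sum(case when [year] = 2015 then price end) as [2015],
       sum(case when [year] = 2014 then price end) as [2014],
       sum(case when [year] = 2013 then price end) as [2013]
from your_table t
group by make, model, variant;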
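To try the conditional-aggregation idea without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. The table name cars, the reduced sample rows, and the list of years are assumptions made for illustration; the SUM(CASE ...) pattern itself is the one from the answer above.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server; the table name `cars` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cars (make TEXT, model TEXT, variant TEXT, year INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?, ?, ?)",
    [
        ("CHEVROLET", "SPARK", "08LS", 2018, 1000000),
        ("CHEVROLET", "SPARK", "08LS", 2017, 2000000),
        ("CHEVROLET", "SPARK", "08LS", 2016, 3000000),
        ("TOYOTA", "LANDCRUISER", "10042DVX", 2018, 7000000),
        ("TOYOTA", "LANDCRUISER", "10042DVX", 2017, 8000000),
        ("TOYOTA", "LANDCRUISER", "10042DVX", 2016, 9000000),
        ("TOYOTA", "LANDCRUISER", "10042DVX", 2015, 10000000),
    ],
)

# Years to turn into columns (an assumption; extend the list as needed).
years = [2018, 2017, 2016, 2015]

# One SUM(CASE ...) column per year. Without an ELSE branch,
# a variant that has no row for a year gets NULL in that column.
year_cols = ",\n       ".join(
    f'SUM(CASE WHEN year = {y} THEN price END) AS "{y}"' for y in years
)
query = f"""
SELECT make, model, variant,
       {year_cols}
FROM cars
GROUP BY make, model, variant
ORDER BY make, model, variant
"""

pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```

Because the year list is fixed when the query text is built, a new year in the data needs a new column in the query. That matches the static form of the answer above.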

Related

Grouping by a column to compare values between similar rows

I'm trying to turn this
+----+---------+-------------------+-----------+
| id | year | desc | amount |
+----+---------+-------------------+-----------+
| 1 | 2017 | car | 500 |
| 2 | 2017 | car | 550 |
| 1 | 2018 | car | 490 |
| 2 | 2018 | car | 550 |
| 1 | 2017 | house | 200 |
| 2 | 2017 | house | 300 |
| 1 | 2018 | house | 210 |
| 2 | 2018 | house | 320 |
| 1 | 2019 | house | 290 |
| 2 | 2019 | house | 325 |
+----+---------+-------------------+-----------+
Into something like this
+----+---------+---------+-------------------+-----------+-----------+
| id | year_0 | year_1 | desc | amount_0 | amount_1 |
+----+---------+---------+-------------------+-----------+-----------+
| 1 | 2017 | 2018 | car | 500 | 490 |
| 2 | 2017 | 2018 | car | 550 | 550 |
| 1 | 2017 | 2018 | house | 200 | 210 |
| 2 | 2017 | 2018 | house | 300 | 320 |
+----+---------+---------+-------------------+-----------+-----------+
But I'm having difficulty getting the two years and two amounts to group by description.
You can achieve the result with a self-join that pairs last year's rows with the current year's rows:
SELECT A.id,a.year year_0,b.year year_1, A.[desc], A.amount amount_0,B.amount amount_1
FROM
(SELECT * FROM YourTable WHERE Year= Datepart(year,GETDATE())-1) AS A
INNER JOIN
(SELECT * FROM YourTable WHERE Year= Datepart(year,GETDATE())) AS B
ON A.id=B.id AND A.[desc]=B.[desc]
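The join above depends on the current date, so it only returns rows when the table holds the current and previous year. As a hedged variation, the sketch below passes the two years as parameters instead. It runs against in-memory SQLite from Python; the table name items and the parameter binding are assumptions for illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `items` is a hypothetical table name; "desc" is quoted because DESC is a keyword.
conn.execute(
    'CREATE TABLE items (id INTEGER, year INTEGER, "desc" TEXT, amount INTEGER)'
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [
        (1, 2017, "car", 500), (2, 2017, "car", 550),
        (1, 2018, "car", 490), (2, 2018, "car", 550),
        (1, 2017, "house", 200), (2, 2017, "house", 300),
        (1, 2018, "house", 210), (2, 2018, "house", 320),
        (1, 2019, "house", 290), (2, 2019, "house", 325),
    ],
)

# Self-join: alias a supplies the first year, alias b the second year,
# matched on the same id and the same description.
query = """
SELECT a.id, a.year AS year_0, b.year AS year_1, a."desc",
       a.amount AS amount_0, b.amount AS amount_1
FROM items AS a
JOIN items AS b
  ON b.id = a.id AND b."desc" = a."desc"
WHERE a.year = ? AND b.year = ?
ORDER BY a."desc", a.id
"""

paired = conn.execute(query, (2017, 2018)).fetchall()
for row in paired:
    print(row)
```

Calling the same query with (2018, 2019) would pair the house rows for those two years and return nothing for car, because an inner join drops ids that lack a row in either year.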

Why do you need to include a field in GROUP BY when using OVER (PARTITION BY x)?

I have a table for which I want to do a simple sum of a field, grouped by two columns. I then want the total for all values for each year_num.
See example: http://rextester.com/QSLRS68794
This query is throwing: "42803: column "foo.num_cust" must appear in the GROUP BY clause or be used in an aggregate function", and I cannot figure out why. Why would an aggregate function using the OVER (PARTITION BY x) require the summed field to be in GROUP BY??
select
year_num
,age_bucket
,sum(num_cust)
--,sum(num_cust) over (partition by year_num) --THROWS ERROR!!
from
foo
group by
year_num
,age_bucket
order by 1,2
TABLE:
| loc_id | year_num | gen | cust_category | cust_age | num_cust | age_bucket |
|--------|-----------|------|----------------|-----------|-----------|-------------|
| 1 | 2016 | M | cash | 41 | 2 | 04_<45 |
| 1 | 2016 | F | Prepaid | 41 | 1 | 03_<35 |
| 1 | 2016 | F | cc | 61 | 1 | 05_45+ |
| 1 | 2016 | F | cc | 19 | 2 | 02_<25 |
| 1 | 2016 | M | cc | 64 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 46 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 27 | 3 | 03_<35 |
| 1 | 2016 | M | cash | 42 | 1 | 04_<45 |
| 1 | 2017 | F | cc | 35 | 1 | 04_<45 |
| 1 | 2017 | F | cc | 37 | 1 | 04_<45 |
| 1 | 2017 | F | cash | 46 | 1 | 05_45+ |
| 1 | 2016 | F | cash | 19 | 4 | 02_<25 |
| 1 | 2017 | M | cash | 43 | 1 | 04_<45 |
| 1 | 2017 | M | cash | 29 | 1 | 03_<35 |
| 1 | 2016 | F | cc | 13 | 1 | 01_<18 |
| 1 | 2017 | F | cash | 16 | 2 | 01_<18 |
| 1 | 2016 | F | cc | 17 | 2 | 01_<18 |
| 1 | 2016 | M | cc | 17 | 2 | 01_<18 |
| 1 | 2017 | F | cash | 18 | 9 | 02_<25 |
DESIRED OUTPUT:
| year_num | age_bucket | sum | sum over (year_num) |
|----------|------------|-----|---------------------|
| 2016 | 01_<18 | 5 | 21 |
| 2016 | 02_<25 | 6 | 21 |
| 2016 | 03_<35 | 4 | 21 |
| 2016 | 04_<45 | 3 | 21 |
| 2016 | 05_45+ | 3 | 21 |
| 2017 | 01_<18 | 2 | 16 |
| 2017 | 02_<25 | 9 | 16 |
| 2017 | 03_<35 | 1 | 16 |
| 2017 | 04_<45 | 3 | 16 |
| 2017 | 05_45+ | 1 | 16 |
You need to nest the sum()s:
select year_num, age_bucket, sum(num_cust),
sum(sum(num_cust)) over (partition by year_num) --WORKS!!
from foo
group by year_num, age_bucket
order by 1, 2;
Why? The window function does not perform the aggregation itself; it runs over the rows produced by the group by. Its argument therefore has to be an expression that can be evaluated after the grouping (because this is an aggregation query). Because num_cust is not a group by key, it needs to be wrapped in an aggregation function.
Perhaps this is clearer if you use a subquery:
select year_num, age_bucket, sum_num_cust,
sum(sum_num_cust) over (partition by year_num)
from (select year_num, age_bucket, sum(num_cust) as sum_num_cust
from foo
group by year_num, age_bucket
) ya
order by 1, 2;
These two queries do exactly the same thing. But with the subquery it should be more obvious why you need the extra aggregation.
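To check the claim that both forms return the same rows, here is a small sketch that runs them against the question's data in SQLite via Python. It assumes an SQLite build with window-function support (3.25 or later) and keeps only the three columns the queries use; the error in the question came from PostgreSQL, so SQLite is only a stand-in here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Only the three columns used by the queries are kept from the question's table.
conn.execute("CREATE TABLE foo (year_num INTEGER, age_bucket TEXT, num_cust INTEGER)")
conn.executemany(
    "INSERT INTO foo VALUES (?, ?, ?)",
    [
        (2016, "04_<45", 2), (2016, "03_<35", 1), (2016, "05_45+", 1),
        (2016, "02_<25", 2), (2016, "05_45+", 1), (2016, "05_45+", 1),
        (2016, "03_<35", 3), (2016, "04_<45", 1), (2017, "04_<45", 1),
        (2017, "04_<45", 1), (2017, "05_45+", 1), (2016, "02_<25", 4),
        (2017, "04_<45", 1), (2017, "03_<35", 1), (2016, "01_<18", 1),
        (2017, "01_<18", 2), (2016, "01_<18", 2), (2016, "01_<18", 2),
        (2017, "02_<25", 9),
    ],
)

# Form 1: the window SUM takes the grouped SUM as its argument.
nested = conn.execute("""
SELECT year_num, age_bucket, SUM(num_cust),
       SUM(SUM(num_cust)) OVER (PARTITION BY year_num)
FROM foo
GROUP BY year_num, age_bucket
ORDER BY 1, 2
""").fetchall()

# Form 2: group in a subquery first, then apply the window SUM outside.
subquery = conn.execute("""
SELECT year_num, age_bucket, sum_num_cust,
       SUM(sum_num_cust) OVER (PARTITION BY year_num)
FROM (SELECT year_num, age_bucket, SUM(num_cust) AS sum_num_cust
      FROM foo
      GROUP BY year_num, age_bucket) AS ya
ORDER BY 1, 2
""").fetchall()

for row in nested:
    print(row)
print("same result:", nested == subquery)
```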

How to select tuples that don't match two criteria in the same query?

Let's suppose I have the following table list_level
| year | cat_id | user_id | id | val_1 | val_2 |
|------|--------|---------|------|-------|--------|
| 2017 | 2 | 141256 | 1501 | ABC | <null> |
| 2017 | 2 | 141256 | 1023 | DRF | <null> |
| 2017 | 1 | 141256 | 882 | TGV | 100 |
| 2016 | 2 | 141256 | 801 | ADG | 90 |
| 2016 | 1 | 141256 | 590 | IKM | 100 |
| 2016 | 1 | 141256 | 480 | EGM | 87 |
| 2015 | 2 | 141256 | 256 | YHX | 70 |
| 2015 | 1 | 141256 | 132 | QWE | 68 |
How do I get the tuples NOT in year = 2017 and NOT in cat_id = 2?
I tried
SELECT
*
FROM
LIST_LEVEL
WHERE
YEAR <> '2017'
AND CAT_ID NOT IN (2)
But that query returns
| year | cat_id | user_id | id | val_1 | val_2 |
|------|--------|---------|------|-------|--------|
| 2016 | 1 | 141256 | 590 | IKM | 100 |
| 2016 | 1 | 141256 | 480 | EGM | 87 |
| 2015 | 1 | 141256 | 132 | QWE | 68 |
And I need this result set
| year | cat_id | user_id | id | val_1 | val_2 |
|------|--------|---------|------|-------|--------|
| 2017 | 1 | 141256 | 882 | TGV | 100 |
| 2016 | 2 | 141256 | 801 | ADG | 90 |
| 2016 | 1 | 141256 | 590 | IKM | 100 |
| 2016 | 1 | 141256 | 480 | EGM | 87 |
| 2015 | 2 | 141256 | 256 | YHX | 70 |
| 2015 | 1 | 141256 | 132 | QWE | 68 |
Finally, I ended up with this query, but I think it is a bit complex.
SELECT
*
FROM (
SELECT
*
FROM
LIST_LEVEL
WHERE
YEAR <> '2017'
UNION
SELECT
*
FROM
LIST_LEVEL
WHERE
CAT_ID NOT IN (2)
) T
Is there any other way I can write this query?
I think you just want or:
SELECT ll.*
FROM LIST_LEVEL ll
WHERE ll.YEAR <> '2017' OR CAT_ID NOT IN (2) ;
Or, if you prefer:
WHERE NOT (ll.YEAR = '2017' AND CAT_ID IN (2) ) ;
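The OR form and the NOT (... AND ...) form are the same condition by De Morgan's law, and both match the asker's UNION query. The sketch below checks this on the question's rows using SQLite from Python. As an assumption for the demo, year is stored as an integer and compared with 2017 rather than '2017'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE list_level (year INTEGER, cat_id INTEGER, user_id INTEGER, "
    "id INTEGER, val_1 TEXT, val_2 INTEGER)"
)
conn.executemany(
    "INSERT INTO list_level VALUES (?, ?, ?, ?, ?, ?)",
    [
        (2017, 2, 141256, 1501, "ABC", None),
        (2017, 2, 141256, 1023, "DRF", None),
        (2017, 1, 141256, 882, "TGV", 100),
        (2016, 2, 141256, 801, "ADG", 90),
        (2016, 1, 141256, 590, "IKM", 100),
        (2016, 1, 141256, 480, "EGM", 87),
        (2015, 2, 141256, 256, "YHX", 70),
        (2015, 1, 141256, 132, "QWE", 68),
    ],
)


def ids(sql):
    """Run a query and return the id column as a list."""
    return [row[0] for row in conn.execute(sql)]


# The asker's first attempt: AND removes every 2017 row and every cat_id 2 row.
and_form = ids(
    "SELECT id FROM list_level WHERE year <> 2017 AND cat_id NOT IN (2) ORDER BY id"
)
# The two forms from the answer.
or_form = ids(
    "SELECT id FROM list_level WHERE year <> 2017 OR cat_id NOT IN (2) ORDER BY id"
)
not_and_form = ids(
    "SELECT id FROM list_level WHERE NOT (year = 2017 AND cat_id IN (2)) ORDER BY id"
)
# The asker's UNION workaround.
union_form = ids(
    "SELECT id FROM list_level WHERE year <> 2017 "
    "UNION SELECT id FROM list_level WHERE cat_id NOT IN (2) ORDER BY id"
)

print("AND:  ", and_form)
print("OR:   ", or_form)
print("NOT:  ", not_and_form)
print("UNION:", union_form)
```

One caveat: if year or cat_id could be NULL, each of these comparisons evaluates to NULL for that row and the row is filtered out, so nullable columns need an explicit IS NULL check.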
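You can also write it with NOT EXISTS:
SELECT ll.*
FROM LIST_LEVEL ll
WHERE NOT EXISTS (SELECT 1
FROM LIST_LEVEL x
WHERE x.ID = ll.ID -- assumes ID identifies a row
AND x.YEAR = '2017'
AND x.CAT_ID IN (2));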

PDI Kettle - How to Normalize Advanced Structure?

I have 7 columns of data in a MySQL Database. The Year1 column belongs to the Revenue1 column. The following columns have the same structure. I know how to handle this in SQL, but not in PDI. Can anyone describe how to do it?
MySQL table structure:
+--------+-------+-------+-------+----------+----------+----------+
| Ticker | Year1 | Year2 | Year3 | Revenue1 | Revenue2 | Revenue3 |
+--------+-------+-------+-------+----------+----------+----------+
| | | | | | | |
| ABC | 2010 | 2011 | 2012 | 250000 | 500000 | 1000000 |
+--------+-------+-------+-------+----------+----------+----------+
Desired normalized output from PDI:
+------------+------+-----------+---------+
| Ticker | Year | Keyfigure | Value |
+------------+------+-----------+---------+
| | | | |
| ABC | 2010 | Revenue | 250000 |
| | | | |
| ABC | 2011 | Revenue | 500000 |
| | | | |
| ABC | 2012 | Revenue | 1000000 |
+------------+------+-----------+---------+
Have you tried using the Row Normaliser step? It turns columns into rows, which is the direction you need here.

How can I do this in SQL in a Single Statement?

I have the following MySQL table:
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
| Version | Yr_Varient | FY | Period | CoA | Company | Item | Mvt | Ptnr_Co | Investee | GC | LC |
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
| 201 | 1 | 2010 | 1 | 11 | 23 | 1110105000 | 60200 | | | 450000 | 450000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2110300000 | 60200 | | | -520000 | -520000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 1220221600 | | | | 78080 | 78080 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2130323000 | | | | 50000 | 50000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 2130322000 | | | | -58080 | -58080 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3100505000 | | | | -275000 | -275000 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3200652500 | | | | 216920 | 216920 |
| 201 | 1 | 2010 | 1 | 11 | 23 | 3900000000 | | | | 58080 | 58080 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 1110105000 | 60200 | | | 376000 | 376000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2110300000 | 60200 | | | -545000 | -545000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 1220221600 | | | | 452250 | 452250 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2130323000 | | | | -165000 | -165000 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 2130322000 | | | | -118250 | -118250 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3100505000 | | | | -937750 | -937750 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3200652500 | | | | 819500 | 819500 |
| 201 | 1 | 2010 | 1 | 11 | 26 | 3900000000 | | | | 118250 | 118250 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 1110105000 | 60200 | | | 777000 | 777000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2110308000 | 60200 | 43 | | -255000 | -255000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2130321500 | | | | 180000 | 180000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2130322000 | | | | -77000 | -77000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 2310407001 | | 1 | | -625000 | -625000 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3100505000 | | | | -2502500 | -2502500 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3200652500 | | | | 2425500 | 2425500 |
| 201 | 1 | 2010 | 1 | 11 | 37 | 3900000000 | | | | 77000 | 77000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1110105000 | 60200 | | | 2600000 | 2600000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140161000 | 60200 | | 23 | 430000 | 430000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140161000 | 60200 | | 26 | 505556 | 505556 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1140160000 | 60200 | 37 | | 255000 | 255000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1160163000 | 60200 | 99999 | 48 | 49428895 | 49428895 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1160163000 | 60200 | 99999 | 49 | 188260175 | 188260175 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2310405500 | | | | -237689070 | -237689070 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2110300000 | 60200 | | | -1000 | -1000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2110300500 | 60200 | | | -3999000 | -3999000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 1220221600 | | | | 1571112 | 1571112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2130321500 | | | | -805556 | -805556 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 2130322000 | | | | -556112 | -556112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3100505000 | | | | -836000 | -836000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3200652500 | | | | 781000 | 781000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3300715700 | | 99999 | 32 | -440000 | -440000 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3300715700 | | 99999 | 26 | -61112 | -61112 |
| 201 | 1 | 2010 | 1 | 11 | 43 | 3900000000 | | | | 556112 | 556112 |
+---------+------------+------+--------+------+---------+------------+-------+---------+----------+------------+------------+
I need to take all rows with Mvt = 60200 and multiply every GC and LC record in that row by 1.1 and add a new row containing the changes back into the same table with FY set to 2011.
How can I do all this in 1 statement?
Is it even possible to do all this in 1 statement (I know very little about SQL)?
Can this be done in standard SQL as the database will be ported to another Database Server?
I don't know which server it will be.
In standard SQL (there may be better ways in vendor-specific implementations but I tend to prefer standard stuff where possible):
insert into mytable (
Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
FY, GC, LC
) select
Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
2011, GC*1.1, LC*1.1
from mytable
where Mvt = 60200
-- and FY = 2010
You may also want to limit the select a little more, depending on the results of your testing: for example, uncomment the "and FY = 2010" line above so that any 2009 and 2008 data is not copied as well. I assume you only want to carry forward the previous year's rows with a 10% increase on GC and LC.
The way this works is to run the select which gives modified data for FY, GC and LC as per your request, and pump all those rows back into the insert.
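The insert-from-select described above can be tried on a few rows with Python's sqlite3 module. This is a minimal sketch, not a port to any particular server: the column types are guesses, only three of the question's rows are loaded, and the FY = 2010 filter is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the question does not show the table definition.
conn.execute("""
CREATE TABLE mytable (
    Version INTEGER, Yr_Varient INTEGER, FY INTEGER, Period INTEGER,
    CoA INTEGER, Company INTEGER, Item INTEGER, Mvt INTEGER,
    Ptnr_Co INTEGER, Investee INTEGER, GC REAL, LC REAL
)
""")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    [
        (201, 1, 2010, 1, 11, 23, 1110105000, 60200, None, None, 450000, 450000),
        (201, 1, 2010, 1, 11, 23, 2110300000, 60200, None, None, -520000, -520000),
        (201, 1, 2010, 1, 11, 23, 1220221600, None, None, None, 78080, 78080),
    ],
)

# The SELECT produces the modified rows (FY replaced by 2011, GC and LC scaled),
# and the INSERT writes them back into the same table in one statement.
conn.execute("""
INSERT INTO mytable (
    Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
    FY, GC, LC
)
SELECT
    Version, Yr_Varient, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee,
    2011, GC * 1.1, LC * 1.1
FROM mytable
WHERE Mvt = 60200
  AND FY = 2010
""")

new_rows = conn.execute(
    "SELECT Item, FY, GC, LC FROM mytable WHERE FY = 2011 ORDER BY Item"
).fetchall()
total_rows = conn.execute("SELECT COUNT(*) FROM mytable").fetchone()[0]

for row in new_rows:
    print(row)
print("rows in table:", total_rows)
```

Only the two rows with Mvt = 60200 are copied. With REAL columns the scaled values are floating-point, so they may differ from the exact decimal result in the last digits; a decimal column type on the target server would avoid that.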
insert into mytable (
Version, Yr_Varient, FY, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee, GC, LC)
select Version, Yr_Varient, 2011 as FY, Period, CoA, Company, Item, Mvt, Ptnr_Co, Investee, GC*1.1 as GC, LC*1.1 as LC
from mytable
where Mvt = 60200
INSERT INTO _table_
(Version,
Yr_Varient,
FY,
Period,
CoA,
Company,
Item,
Mvt,
Ptnr_Co,
Investee,
GC,
LC)
SELECT
Version,
Yr_Varient,
2011,
Period,
CoA,
Company,
Item,
Mvt,
Ptnr_Co,
Investee,
GC * 1.1,
LC * 1.1
FROM
_table_
WHERE
Mvt = 60200
AND FY <> 2011
This statement should work in any SQL database.