Remove duplicates from query - sql

I have the below table
item    area  qty
item 1  a     10
item 1  b     17
item 2  b     20
item 3  a     10
item 2  c     8
I am looking to have a result in SQL as below (a unique item and a unique area):
item    area a  area b  area c
item 1  10      17      0
item 2  0       20      8
item 3  10      0       0
I do have this query, but it does not give me what I am looking for when the areas change or increase in number. It was also written for a two-column table, not a three-column one:
select
item,
max(case when seqnum = 1 then area end) as area_1,
max(case when seqnum = 2 then area end) as area_2,
max(case when seqnum = 3 then area end) as area_3
from (
select A.*,
row_number() over (partition by item order by area) as seqnum
from A
) A
group by item;
Looking forward to your kind help.

If you have a fixed list of areas, there is no need for window functions; you can filter explicitly on each individual area value inside max().
Another fix to your query is to take the max of qty rather than of area (the area value is already determined by the filter).
select item,
coalesce(max(case when area = 'a' then qty end), 0) as area_a,
coalesce(max(case when area = 'b' then qty end), 0) as area_b,
coalesce(max(case when area = 'c' then qty end), 0) as area_c
from mytable
group by item
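To check the query above against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. The table name mytable comes from the answer; the use of SQLite and the sqlite3 module is an assumption made only so the example is self-contained.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (item text, area text, qty integer)")
conn.executemany(
    "insert into mytable values (?, ?, ?)",
    [
        ("item 1", "a", 10),
        ("item 1", "b", 17),
        ("item 2", "b", 20),
        ("item 3", "a", 10),
        ("item 2", "c", 8),
    ],
)

# One output column per known area; coalesce turns a missing area into 0.
pivot_sql = """
select item,
       coalesce(max(case when area = 'a' then qty end), 0) as area_a,
       coalesce(max(case when area = 'b' then qty end), 0) as area_b,
       coalesce(max(case when area = 'c' then qty end), 0) as area_c
from mytable
group by item
order by item
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

If an item could have several rows for the same area, replacing max with sum would add the quantities up instead of keeping the largest one.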

Related

SQL view on table to combine rows into additional columns

This has stumped me a little, I have a table like this
Id  Address            Address 1   Postcode
1   1 straight street  4 corners   BL51 ANK
1   46 Double Close    Some Place  ZE12 7TB
2   7 The Fields       Farmland    FA7 5ME
I need to create a view that will produce this result:
Id  Address            Address 1  Postcode  Address          Address 1   Postcode
1   1 straight street  4 corners  BL51 ANK  46 Double Close  Some Place  ZE12 7TB
2   7 The Fields       Farmland   FA7 5ME
So basically, for each ID there can be between 4 and 50 rows, and I need them returned as a single row with multiple columns containing the different data. I didn't want to do 50 joins, as I'm sure there is a smarter way to do this.
Any help is much appreciated.
You can use conditional aggregation (or pivot). Simpler and faster than 50 joins, but still cumbersome:
select id,
max(case when seqnum = 1 then address end) as address_1,
max(case when seqnum = 1 then address1 end) as address1_1,
max(case when seqnum = 1 then postcode end) as postcode_1,
max(case when seqnum = 2 then address end) as address_2,
max(case when seqnum = 2 then address1 end) as address1_2,
max(case when seqnum = 2 then postcode end) as postcode_2,
. . .
max(case when seqnum = 50 then address end) as address_50,
max(case when seqnum = 50 then address1 end) as address1_50,
max(case when seqnum = 50 then postcode end) as postcode_50
from (select t.*,
row_number() over (partition by id order by (select null)) as seqnum
from t
) t
group by id;
The code is pretty repetitive, so it is simple to generate it in a spreadsheet.
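As an alternative to a spreadsheet, the repetitive select list can also be generated with a few lines of script. The sketch below is only illustrative: the function names build_select_list and build_pivot_query, the subquery aliases and the column aliases such as address_1 are hypothetical, and the column names address, address1 and postcode are taken from the query above.

```python
def build_select_list(n_slots, columns=("address", "address1", "postcode")):
    """Build one max(case ...) expression per slot and column."""
    parts = []
    for slot in range(1, n_slots + 1):
        for col in columns:
            # Hypothetical alias scheme: <column>_<slot>, e.g. postcode_2.
            parts.append(
                f"max(case when seqnum = {slot} then {col} end) as {col}_{slot}"
            )
    return ",\n       ".join(parts)


def build_pivot_query(n_slots, table="t"):
    """Wrap the generated select list in the row_number() subquery."""
    return (
        "select id,\n       "
        + build_select_list(n_slots)
        + "\nfrom (select src.*,\n"
        + "             row_number() over (partition by id order by (select null)) as seqnum\n"
        + f"      from {table} src\n"
        + "     ) numbered\n"
        + "group by id;"
    )


query_text = build_pivot_query(50)
print(query_text)
```

The printed text can be pasted into the view definition; changing the argument to build_pivot_query adjusts the number of address slots.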

How to choose the first value then ignore the rest?

I have a table with data as follows:
id activity amount
1 unknown 20
2 storage 20
3 storage 20
4 swift 20
5 delivery 50
6 storage 20
I want to create a query which gives me the "calculated" sum.
For the example above, the desired result is:
id activity amount calculatedsum
1 unknown 20 0
2 storage 20 20
3 storage 20 20
4 swift 20 20
5 delivery 50 70 (had 20 and 50 arrived)
6 storage 20 70
The logic is simple:
find the first row which is 'storage'; its amount is the calculatedsum. When a row with 'delivery' is encountered, add its amount, and that is the new calculatedsum.
This is what I tried to do:
select *,
sum(case when activity = 'Storage' then amount
when activity = 'delivery' then + amount
else 0
end) over (order by id)
from A
However, this doesn't work.
How can I get the expected result?
Edit: id is a column which was created by: select row_number() over (order by .... nulls last) as id
The table contains the result of that query, and every time the query runs the table is reset by it, so the id is always the actual row number.
If you are only counting the first 'storage', then you need to identify it. You can do that using row_number():
select *,
sum(case when activity = 'storage' and seqnum = 1 then amount
when activity = 'delivery' then amount
else 0
end) over (order by id)
from (select a.*,
row_number() over (partition by activity order by id) as seqnum
from A a
) a
A totally weird way of doing this without a subquery, if we assume that the amounts for storage are all the same:
select *,
(sum(case when activity = 'delivery' then amount
else 0
end) over (order by id) +
-- running max: 0 before the first 'storage' row, the storage amount afterwards
max(case when activity = 'storage' then amount else 0
end) over (order by id)
)
from A a;
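The row_number() approach above can be checked against the sample data with a small sketch like the following. It assumes an in-memory SQLite database (version 3.25 or later, for window functions) driven from Python, and it compares activity against the lowercase 'storage' used in the sample rows; the subquery aliases src and numbered are illustrative.

```python
import sqlite3

# Assumption: SQLite 3.25+ (window function support) stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("create table A (id integer, activity text, amount integer)")
conn.executemany(
    "insert into A values (?, ?, ?)",
    [
        (1, "unknown", 20),
        (2, "storage", 20),
        (3, "storage", 20),
        (4, "swift", 20),
        (5, "delivery", 50),
        (6, "storage", 20),
    ],
)

# seqnum = 1 marks the first row of each activity, so only the first
# 'storage' row contributes; every 'delivery' row is added to the running sum.
running_sql = """
select numbered.id, numbered.activity, numbered.amount,
       sum(case when activity = 'storage' and seqnum = 1 then amount
                when activity = 'delivery' then amount
                else 0
           end) over (order by id) as calculatedsum
from (select src.*,
             row_number() over (partition by activity order by id) as seqnum
      from A src
     ) numbered
order by id
"""
running_rows = conn.execute(running_sql).fetchall()
for row in running_rows:
    print(row)
```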

How to find all items in table that don't have a certain text in column

I have a table iminvbin_sql with columns item_no, loc and bin_no. Each item number should have 4 bins in location 2. How do I find all the items that do not have these four bins in loc 2?
I tried
select item_no from iminvbin_sql where bin_no not in('910SHIP','910STAGE','910PROD','1') AND loc = 2
but that didn't work.
itemno loc bin
0 2 1
0 2 910PROD
0 2 910SHIP
0 2 910STAGE
I think this is what you want:
select item_no
from iminvbin_sql
where bin_no in ('910SHIP', '910STAGE', '910PROD', '1') and
loc = 2
group by item_no
having count(distinct bin_no) <> 4;
This will check that all four of those values are in the bins. If you want to verify that these four values are in the bins and no other values are, you can test for this as well:
select item_no
from iminvbin_sql
where loc = 2
group by item_no
having count(distinct bin_no) <> 4 or
count(distinct case when bin_no in ('910SHIP', '910STAGE', '910PROD', '1') then bin_no end) <> 4 or
count(*) <> 4;
EDIT:
In response to Bohemian's comment, the following should get all items that are not fully populated:
select item_no
from iminvbin_sql
group by item_no
having count(distinct case when loc = 2 then bin_no end) <> 4 or
count(distinct case when loc = 2 and bin_no in ('910SHIP', '910STAGE', '910PROD', '1') then bin_no end) <> 4 or
sum(case when loc = 2 then 1 else 0 end) <> 4;
Use a group by with a having clause to determine if all bins are not there:
select item_no
from iminvbin_sql
group by item_no
having sum(case when loc = 2 and bin_no in ('910SHIP','910STAGE','910PROD','1') then 1 else 0 end) < 4
The important point with this query is that, by moving the conditions into the case, it will also find items that have none of the listed bins, or even items that have no data for loc = 2. The else 0 matters here: without it the sum is NULL for such items and the comparison would filter them out.
SELECT
itemno
FROM
iminvbin_sql
WHERE
loc = 2
GROUP BY
itemno
HAVING
4 <> SUM(CASE WHEN bin IN ('910SHIP','910STAGE','910PROD','1') THEN 1 ELSE 0 END)
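The difference between filtering on loc = 2 in the where clause and moving the conditions into the case expression can be seen in a small sketch. It assumes an in-memory SQLite database driven from Python, uses the column names item_no, loc and bin_no from the question, and adds two hypothetical items: "A" with only two bins in loc 2 and "B" with no rows in loc 2 at all.

```python
import sqlite3

# Assumption: in-memory SQLite; items "A" and "B" are made-up test data.
conn = sqlite3.connect(":memory:")
conn.execute("create table iminvbin_sql (item_no text, loc integer, bin_no text)")
rows = [("0", 2, b) for b in ("1", "910PROD", "910SHIP", "910STAGE")]  # complete
rows += [("A", 2, "1"), ("A", 2, "910PROD")]                           # two bins missing
rows += [("B", 1, "1")]                                                # nothing in loc 2
conn.executemany("insert into iminvbin_sql values (?, ?, ?)", rows)

bins = "('910SHIP', '910STAGE', '910PROD', '1')"

# Filtering in the where clause: items without any loc 2 row disappear entirely.
where_filtered = conn.execute(f"""
select item_no
from iminvbin_sql
where loc = 2
group by item_no
having sum(case when bin_no in {bins} then 1 else 0 end) <> 4
order by item_no
""").fetchall()

# Conditions inside the case: every item is grouped, so "B" is reported too.
# Without "else 0" the sum for "B" would be NULL and the row would be dropped.
case_filtered = conn.execute(f"""
select item_no
from iminvbin_sql
group by item_no
having sum(case when loc = 2 and bin_no in {bins} then 1 else 0 end) < 4
order by item_no
""").fetchall()

print(where_filtered)
print(case_filtered)
```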

Exclude value of a record in a group if another is present

In the example table below, I'm trying to figure out a way to sum amount over id for all marks where mark 'C' doesn't exist within an id. When mark 'C' does exist in an id, I want the sum of amounts over that id, excluding the amount against mark 'A'. As illustration, my desired output is at the bottom. I've considered using partitions and the EXISTS command, but I'm having trouble conceptualizing the solution. If any of you could take a look and point me in the right direction, it would be greatly appreciated :)
sample table:
id mark amount
------------------
1 A 1
2 A 3
2 B 2
3 A 2
4 A 1
4 B 3
5 A 1
5 C 3
6 A 2
6 C 2
desired output:
id sum(amount)
-----------------
1 1
2 5
3 2
4 4
5 3
6 2
select
id,
case
when count(case mark when 'C' then 1 else null end) = 0
then
sum(amount)
else
sum(case when mark <> 'A' then amount else 0 end)
end
from sampletable
group by id
Here is my effort:
select id, sum(amount) from sampletable t where t.mark <> 'A' group by id
having id in (select id from sampletable where mark = 'C')
union
select id, sum(amount) from sampletable t group by id
having id not in (select id from sampletable where mark = 'C')
SELECT
id,
sum(amount) AS sum_amount
FROM atable t
WHERE mark <> 'A'
OR NOT EXISTS (
SELECT *
FROM atable
WHERE id = t.id
AND mark = 'C'
)
GROUP BY
id
;
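The conditional-aggregation answer and the NOT EXISTS answer above can both be checked against the desired output with a short sketch. It assumes an in-memory SQLite database driven from Python and uses the single table name sampletable for both queries (the last answer calls it atable).

```python
import sqlite3

# Assumption: in-memory SQLite with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table sampletable (id integer, mark text, amount integer)")
conn.executemany(
    "insert into sampletable values (?, ?, ?)",
    [
        (1, "A", 1), (2, "A", 3), (2, "B", 2), (3, "A", 2), (4, "A", 1),
        (4, "B", 3), (5, "A", 1), (5, "C", 3), (6, "A", 2), (6, "C", 2),
    ],
)

# Variant 1: decide per group whether a 'C' row exists, then pick the sum.
case_rows = conn.execute("""
select id,
       case when count(case mark when 'C' then 1 else null end) = 0
            then sum(amount)
            else sum(case when mark <> 'A' then amount else 0 end)
       end as sum_amount
from sampletable
group by id
order by id
""").fetchall()

# Variant 2: drop the 'A' rows of ids that contain a 'C' before grouping.
exists_rows = conn.execute("""
select id, sum(amount) as sum_amount
from sampletable t
where mark <> 'A'
   or not exists (select * from sampletable where id = t.id and mark = 'C')
group by id
order by id
""").fetchall()

print(case_rows)
print(exists_rows)
```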

SQL Level the data conditionally

I have the following puzzle to solve (an urgent business assignment to be exact)
SQL SERVER 2008
I have a table of this form
ID Market SubMarket Value
1 1 1 3
2 1 2 6
3 1 3 2
4 2 23 1
5 2 24 9
I have specific MarketIDs, and every MarketID has specific SubMarketIDs (maximum 5; I know how many for each),
e.g. MarketID 1 has SubMarketIDs 1,2,3,
MarketID 2 has SubMarketIDs 23,24, etc.,
and each SubMarketID has a variable value.
I must transform my data in a fixed table of this type
MarketID SubMarketAvalue SubMarketBValue SubMarketCValue....SubMarketEValue
1 3 6 2 null
2 1 9 null null
SubMarketAValue must contain the value of the smallest SubMarketID.
SubMarketBValue must contain the value of the next larger SubMarketID.
You did not specify the RDBMS, but you can use the following in SQL Server 2005+, Oracle and PostgreSQL:
select market,
max(case when rn = 1 then value end) as SubMarketAvalue,
max(case when rn = 2 then value end) as SubMarketBvalue,
max(case when rn = 3 then value end) as SubMarketCvalue,
max(case when rn = 4 then value end) as SubMarketDvalue,
max(case when rn = 5 then value end) as SubMarketEvalue
from
(
select id, market, submarket, value,
row_number() over(partition by market
order by market, submarket) rn
from yourtable
) x
group by market
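A quick way to try the query above on the sample data is sketched below. It assumes an in-memory SQLite database (version 3.25 or later, for row_number()) driven from Python rather than SQL Server, and keeps the table name yourtable from the answer.

```python
import sqlite3

# Assumption: SQLite 3.25+ stands in for SQL Server; sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table yourtable (id integer, market integer, submarket integer, value integer)"
)
conn.executemany(
    "insert into yourtable values (?, ?, ?, ?)",
    [(1, 1, 1, 3), (2, 1, 2, 6), (3, 1, 3, 2), (4, 2, 23, 1), (5, 2, 24, 9)],
)

# rn numbers the submarkets of each market from smallest to largest,
# so rn = 1 feeds SubMarketAvalue, rn = 2 feeds SubMarketBvalue, and so on.
level_sql = """
select market,
       max(case when rn = 1 then value end) as SubMarketAvalue,
       max(case when rn = 2 then value end) as SubMarketBvalue,
       max(case when rn = 3 then value end) as SubMarketCvalue,
       max(case when rn = 4 then value end) as SubMarketDvalue,
       max(case when rn = 5 then value end) as SubMarketEvalue
from (select id, market, submarket, value,
             row_number() over (partition by market order by submarket) as rn
      from yourtable
     ) x
group by market
order by market
"""
level_rows = conn.execute(level_sql).fetchall()
for row in level_rows:
    print(row)
```

Slots without a matching submarket come back as NULL (None in Python), which matches the null cells in the requested layout.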