Different value counts on same column - sql

I am new to Oracle. I have an Oracle table with three columns: serialno, item_category and item_status. In the third column the rows have values of serviceable, under_repair or condemned.
I want to run the query using count to show how many are serviceable, how many are under repair, how many are condemned against each item category.
I would like to run something like:
select item_category
, count(......) "total"
, count (.....) "serviceable"
, count(.....)"under_repair"
, count(....) "condemned"
from my_table
group by item_category ......
I am unable to run the inner query inside the count.
Here's what I'd like the result set to look like:
item_category  total  serviceable  under repair  condemned
=============  =====  ===========  ============  =========
chair             18           10             5          3
table             12            6             3          3

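You can use either a CASE expression or the DECODE function inside COUNT. COUNT ignores NULLs, and DECODE returns NULL when the status does not match, so each column counts only the rows with its own status.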
SELECT item_category,
COUNT (*) total,
COUNT (DECODE (item_status, 'serviceable', 1)) AS serviceable,
COUNT (DECODE (item_status, 'under_repair', 1)) AS under_repair,
COUNT (DECODE (item_status, 'condemned', 1)) AS condemned
FROM mytable
GROUP BY item_category;
Output:
ITEM_CATEGORY  TOTAL  SERVICEABLE  UNDER_REPAIR  CONDEMNED
-------------  -----  -----------  ------------  ---------
chair              5            1             2          2
table              5            3             1          1
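For comparison, the CASE form mentioned above might look like the following. This is only a sketch: it assumes the table is named my_table, as in the question, and relies on a CASE without ELSE yielding NULL for non-matching rows, which COUNT then skips.

```sql
-- Conditional counts per category using CASE instead of DECODE.
-- Assumption: the table is called my_table, as in the question.
SELECT item_category,
       COUNT(*)                                                  AS total,
       COUNT(CASE WHEN item_status = 'serviceable'  THEN 1 END) AS serviceable,
       COUNT(CASE WHEN item_status = 'under_repair' THEN 1 END) AS under_repair,
       COUNT(CASE WHEN item_status = 'condemned'    THEN 1 END) AS condemned
FROM my_table
GROUP BY item_category;
```

CASE is standard SQL, so this version should carry over to other databases, whereas DECODE is Oracle-specific.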

This is a very basic "group by" query. If you search for that you will find plenty of documentation on how it is used.
For your specific case, you want:
select item_category, item_status, count(*)
from <your table>
group by item_category, item_status;
You'll get something like this:
item_category  item_status   count(*)
=============  ============  ========
Chair          under_repair         7
Chair          condemned           16
Table          under_repair         3
Change the column ordering as needed for your purpose.

I have a tendency of writing this stuff up so when I forget how to do it, I have an easy to find example.
The PIVOT clause was new in 11g. Since that was 5+ years ago, I'm hoping you are using it.
Sample Data
create table t
(
serialno number(2,0),
item_category varchar2(30),
item_status varchar2(20)
);
insert into t ( serialno, item_category, item_status )
select
rownum serialno,
( case
when rownum <= 12 then 'table'
else 'chair'
end ) item_category,
( case
--table status
when rownum <= 12
and rownum <= 6
then 'servicable'
when rownum <= 12
and rownum between 7 and 9
then 'under_repair'
when rownum <= 12
and rownum > 9
then 'condemned'
--chair status
when rownum > 12
and rownum < 13 + 10
then 'servicable'
when rownum > 12
and rownum between 23 and 27
then 'under_repair'
when rownum > 12
and rownum > 27
then 'condemned'
end ) item_status
from
dual connect by level <= 30;
commit;
and the PIVOT query:
select *
from
(
select
item_status stat,
item_category,
item_status
from t
)
pivot
(
count( item_status )
for stat in ( 'servicable' as "servicable", 'under_repair' as "under_repair", 'condemned' as "condemned" )
);
ITEM_CATEGORY  servicable  under_repair  condemned
-------------  ----------  ------------  ---------
chair                  10             5          3
table                   6             3          3
I still prefer @Ramblin' Man's way of doing it (except using CASE in place of DECODE) though.
Edit
Just realized I left out the TOTAL column. I'm not sure there's a way to get that column using the PIVOT clause, perhaps someone else knows how. May also be the reason I don't use it that often.
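One possible way to get a TOTAL column alongside PIVOT is to add up the pivoted columns in the outer select list. The sketch below assumes the sample table t from above and spells the status literals exactly as they are stored in that sample data; the total only covers the statuses listed in the IN clause.

```sql
-- PIVOT with a derived TOTAL column (Oracle 11g or later).
-- Assumption: table t and its status values are exactly as in the sample data above.
select item_category,
       "servicable" + "under_repair" + "condemned" as total,
       "servicable",
       "under_repair",
       "condemned"
from (
    select item_category, item_status
    from t
)
pivot (
    count(*)
    for item_status in ( 'servicable'   as "servicable",
                         'under_repair' as "under_repair",
                         'condemned'    as "condemned" )
);
```

COUNT returns 0 rather than NULL for a missing status, so the addition is safe. With the sample data above this should give a total of 18 for chair and 12 for table.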

Related

SQL Server - Find similarities in column and write them into new column

I have a big table with data like this:
ID  Title
--  ---------------------
1   01_SOMESTRING_038
2   01_SOMESTRING K5038
3   01_SOMESTRING-648
4   K-OTHERSTRING_T_73474
5   K-OTHERSTRING_T_ffk
6   ABC
7   DEF
And the task is now to find similarities in that column, and write that found similarity to a new column.
So the desired output would be like this:
ID  Title                  Similarity
--  ---------------------  ----------------
1   01_SOMESTRING_038      01_SOMESTRING
2   01_SOMESTRING K5038    01_SOMESTRING
3   01_SOMESTRING-648      01_SOMESTRING
4   K-OTHERSTRING_T_73474  K-OTHERSTRING_T_
5   K-OTHERSTRING_T_ffk    K-OTHERSTRING_T_
6   ABC                    NULL
7   DEF                    NULL
How can I achieve that in MS SQL Server 17?
Any help is much appreciated. Thanks!
EDIT: The strings are not only broken by delimiters such as "-", "_".
And for handling competing similarities I would set a minimum length for the similarity, for instance 10.
Try the following, using a recursive CTE to split out the letters, then we can group them up to find the greatest match:
WITH TITLE_EXPAND AS (
SELECT
1 MatchLen
,CAST(SUBSTRING(Title,1,1) as NVARCHAR(255)) MatchString
,Title
,ID
FROM
[SourceDataTable]
UNION ALL
SELECT
MatchLen + 1
,CAST(SUBSTRING(Title,1,MatchLen+1) AS NVARCHAR(255))
,Title
,ID
FROM
TITLE_EXPAND
WHERE
MatchLen < LEN(Title)
)
SELECT DISTINCT
SDT.ID
,SDT.title
,FIRST_VALUE(MatchString) OVER (PARTITION BY SDT.ID ORDER BY SC.MatchLen DESC, SC.MatchCount DESC) Similarity
FROM
[SourceDataTable] SDT
LEFT JOIN
(SELECT
*
,COUNT(*) OVER (PARTITION BY MatchString, MatchLen) MatchCount
FROM
TITLE_EXPAND) SC
ON
SDT.ID = SC.ID
AND
SC.MatchCount > 1
ORDER BY SDT.ID
Where SourceDataTable is your source table. The Similarity value will be the longest matched similar value.
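The query above does not yet apply the minimum length mentioned in the question's edit. A hedged sketch of one way to add it follows. The variable @MinLen and the MAXRECURSION hint are additions for illustration, not part of the original answer; the hint is there because SQL Server stops a recursive CTE after 100 levels by default, which titles longer than about 100 characters would exceed.

```sql
-- Same prefix-expansion idea, restricted to shared prefixes of at least @MinLen characters.
-- Assumptions: @MinLen is a hypothetical threshold; titles are at most 255 characters long.
DECLARE @MinLen int = 10;

WITH TITLE_EXPAND AS (
    -- Anchor: the first character of every title
    SELECT 1 AS MatchLen,
           CAST(SUBSTRING(Title, 1, 1) AS NVARCHAR(255)) AS MatchString,
           Title,
           ID
    FROM [SourceDataTable]
    UNION ALL
    -- Recursive step: extend each prefix by one character
    SELECT MatchLen + 1,
           CAST(SUBSTRING(Title, 1, MatchLen + 1) AS NVARCHAR(255)),
           Title,
           ID
    FROM TITLE_EXPAND
    WHERE MatchLen < LEN(Title)
)
SELECT DISTINCT
       SDT.ID,
       SDT.Title,
       FIRST_VALUE(SC.MatchString) OVER (PARTITION BY SDT.ID
                                         ORDER BY SC.MatchLen DESC, SC.MatchCount DESC) AS Similarity
FROM [SourceDataTable] SDT
LEFT JOIN (
    -- Count how many rows share each prefix
    SELECT *,
           COUNT(*) OVER (PARTITION BY MatchString, MatchLen) AS MatchCount
    FROM TITLE_EXPAND
) SC
    ON  SDT.ID = SC.ID
    AND SC.MatchCount > 1
    AND SC.MatchLen >= @MinLen
ORDER BY SDT.ID
OPTION (MAXRECURSION 255);
```

Rows whose longest shared prefix is shorter than @MinLen get NULL in Similarity, because the LEFT JOIN finds no qualifying prefix for them.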

Case in Sql group by query

I am working on a project in which I want to use CASE to calculate the price of a product under a specific reference number in SQL Server. Below is my SQL query:
SELECT
product AS Products,
refNum AS Reference,
COUNT(id) AS Count
FROM ProductPriceList
GROUP BY
refNum, product
By executing the above query I get:
Product   Reference  Count
Product1  Ref08         24
Product2  Ref08          7
Product3  Ref07         32
Product2  Ref12          1
Product3  Ref12         18
Product1  Ref07         76
Product1  Null          56
Can anyone guide me on how to use a CASE expression in a SQL query with GROUP BY to show the price? The cases are:
if count < 10 then price 1
if count > 10 and < 100 then price 2
if count > 100 then price 3
I don't want to add a new table to my database. I hope you can understand my question.
Thanks in advance.
I think a basic CASE expression can handle your requirement:
SELECT
product AS Products,
refNum AS Reference,
CASE WHEN COUNT(*) < 10 THEN 1
WHEN COUNT(*) >= 10 AND COUNT(*) < 100 THEN 2
ELSE 3 END AS price
FROM ProductPriceList
GROUP BY
product, refNum;
Not much to explain here, except that the price 2 branch uses a lower bound that includes a count of 10, since the price 1 branch excludes it. A count of exactly 100 falls through to the ELSE branch and gets price 3.
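Because a searched CASE expression tests its WHEN branches in order and stops at the first one that matches, the lower bound in the second branch is optional. A minimal sketch of the same query with that simplification, using the table and column names from the question:

```sql
-- Price tier per product and reference; branches are evaluated top to bottom.
SELECT product AS Products,
       refNum  AS Reference,
       CASE WHEN COUNT(*) < 10  THEN 1
            WHEN COUNT(*) < 100 THEN 2   -- only reached when COUNT(*) >= 10
            ELSE 3
       END AS price
FROM ProductPriceList
GROUP BY product, refNum;
```

Keeping the explicit lower bound does no harm; dropping it just leaves one place to edit if the thresholds change.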
Here's an alternative (it doesn't differ much from the existing one, though).
You can use your query as a subquery and apply CASE outside:
select Products,
--to get NULL values back
case Reference when 'RefNull' then NULL else Reference end [Reference],
case when [Count] < 10 then 1
when [Count] between 10 and 100 then 2
else 3 end [price]
from (
SELECT product AS Products,
--to allow also null values to be grouped
coalesce(refNum, 'RefNull') AS Reference,
COUNT(id) AS [Count]
FROM ProductPriceList
GROUP BY coalesce(refNum, 'RefNull'), product
) [a]
Dataset:
Create Table ProductPriceList
(
Product varchar(10)
,RefNum CHAR(5)
,Records Int
);
Insert into ProductPriceList
Values
('Product1','Ref08',24)
,('Product2','Ref08',7)
,('Product3','Ref07',32)
,('Product2','Ref12',1)
,('Product3','Ref12',18)
,('Product1','Ref07',76)
,('Product1', NULL, 56);
With RCTE AS
(
Select Product
,RefNum
,Records
,1 RowNo
From ProductPriceList PPL
Union All
Select Product
,RefNum
,Records
,RowNo + 1
From RCTE R
Where RowNo + 1 < Records
)
Insert Into ProductPriceList (Product, RefNum, Records)
Select Product, RefNum, Records
From RCTE
where Records > 1
Query to fetch desired result:
Select Product
,RefNum
,Case When Count(*) < 10 Then 1
When Count(*) Between 10 and 99 then 2
Else 3 End Price
From ProductPriceList
Group By Product, RefNum

SQL aggregate rows with same id , specific value in secondary column

I'm looking to filter out rows in the database (PostgreSQL) if one of the values in the status column occurs. The idea is to sum the amount column if the unique reference only has a status equals to 1. The query should not SELECT the reference at all if it has also a status of 2 or any other status for that matter. status refers to the state of the transaction.
Current data table:
reference | amount | status
        1 |    100 |      1
        2 |    120 |      1
        2 |   -120 |      2
        3 |    200 |      1
        3 |   -200 |      2
        4 |    450 |      1
Result:
amount | status
   550 |      1
I've simplified the data example but I think it gives a good idea of what I'm looking for.
I'm unsuccessful in selecting only references that only have status 1.
I've tried sub-queries, using the HAVING clause and other methods without success.
Thanks
Here's a way using not exists to sum all rows where the status is 1 and other rows with the same reference and a non 1 status do not exist.
select sum(amount) from mytable t1
where status = 1
and not exists (
select 1 from mytable t2
where t2.reference = t1.reference
and t2.status <> 1
)
SELECT SUM(amount)
FROM mytable
WHERE reference NOT IN (
SELECT reference
FROM mytable
WHERE status<>1
)
The subquery selects all references that must be excluded, then the main query sums everything except them. Note that NOT IN returns no rows at all if the subquery yields a NULL reference.
select sum (amount) as amount
from (
select sum(amount) as amount
from t
group by reference
having not bool_or(status <> 1)
) s;
amount
--------
550
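The same HAVING filter can also be written positively with bool_and, and the status column from the requested result can be added as a constant. This is a sketch only, assuming the table is named t as in the query above:

```sql
-- Keep only references whose every row has status = 1, then total them.
-- Assumption: the table is called t, as in the preceding query.
select sum(amount) as amount,
       1           as status   -- constant, since only status-1 references survive
from (
    select sum(amount) as amount
    from t
    group by reference
    having bool_and(status = 1)
) s;
```

bool_and(status = 1) is true only when every non-NULL status in the group equals 1, which expresses the same condition as not bool_or(status <> 1).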
You could use window functions to count occurrences of a status other than 1 within each group:
SELECT SUM(amount) AS amount
FROM (SELECT *,COUNT(*) FILTER(WHERE status<>1) OVER(PARTITION BY reference) cnt
FROM tc) AS sub
WHERE cnt = 0;

How do I aggregate numbers from a string column in SQL

I am dealing with a poorly designed database column which has values like this
ID  cid  Score
1   1    3 out of 3
2   1    1 out of 5
3   2    3 out of 6
4   3    7 out of 10
I want the aggregate sum and percentage of Score column grouped on cid like this
cid  sum          percentage
1    4 out of 8   50
2    3 out of 6   50
3    7 out of 10  70
How do I do this?
You can try it this way:
select
t.cid
, cast(sum(s.a) as varchar(5)) +
' out of ' +
cast(sum(s.b) as varchar(5)) as sum
, ((cast(sum(s.a) as decimal))/sum(s.b))*100 as percentage
from MyTable t
inner join
(select
id
, cast(left(score, charindex(' ', score) - 1) as int) a -- all digits before the first space, so multi-digit scores work too
, cast(substring(score,charindex('out of', score)+7,len(score)) as int) b
from MyTable
) s on s.id = t.id
group by t.cid
Redesign the table, but on-the-fly as a CTE. Here's a solution that's not as short as you could make it, but that takes advantage of the handy SQL Server function PARSENAME. You may need to tweak the percentage calculation if you want to truncate rather than round, or if you want it to be a decimal value, not an int.
In this or most any solution, you have to count on the column values for Score to be in the very specific format you show. If you have the slightest doubt, you should run some other checks so you don't miss or misinterpret anything.
with
P(ID, cid, Score2Parse) as (
select
ID,
cid,
replace(Score,space(1),'.')
from scores
),
S(ID,cid,pts,tot) as (
select
ID,
cid,
cast(parsename(Score2Parse,4) as int),
cast(parsename(Score2Parse,1) as int)
from P
)
select
cid, cast(round(100e0*sum(pts)/sum(tot),0) as int) as percentage
from S
group by cid;
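The query above returns only the percentage. If the "x out of y" column from the question is also wanted, one option is to build it with CONCAT, which is available from SQL Server 2012 onward. The sketch below reuses the same CTEs and the table name scores from this answer:

```sql
-- Per-cid totals as text plus the rounded percentage.
-- Assumption: the table is called scores and Score always looks like 'N out of M'.
with
P(ID, cid, Score2Parse) as (
    select ID,
           cid,
           replace(Score, space(1), '.')   -- '3 out of 5' becomes '3.out.of.5'
    from scores
),
S(ID, cid, pts, tot) as (
    select ID,
           cid,
           cast(parsename(Score2Parse, 4) as int),   -- leftmost part: points
           cast(parsename(Score2Parse, 1) as int)    -- rightmost part: total
    from P
)
select cid,
       concat(sum(pts), ' out of ', sum(tot))                    as [sum],
       cast(round(100e0 * sum(pts) / sum(tot), 0) as int)        as percentage
from S
group by cid;
```

For the sample rows in the question this should produce 4 out of 8 (50), 3 out of 6 (50) and 7 out of 10 (70).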

Selecting and sorting data from a single table

Correction to my question....
I'm trying to select and sort in a query from a single table. The primary key for the table is a combination of a serialized number and a time/date stamp.
The table's name in the database is "A12", the columns are defined as:
Serial2D (PK, char(25), not null)
Completed (PK, datetime, not null)
Result (smallint, null)
MachineID (FK, smallint, null)
PT_1 (float, null)
PT_2 (float, null)
PT_3 (float, null)
PT_4 (float, null)
Since the primary key for the table is a combination of the "Serial2D" and "Completed", there can be multiple "Serial2D" entries with different values in the "Completed" and "Result" columns. (I did not make this database... I have to work with what I got)
I want to write a query that will use the value of the "Result" column (always a "0" or "1") and retrieve only one row for each "Serial2D" value. If the "Result" column has a "1" for a row, I want to choose it over any entries for that serial that have a "0" in the Result column. There should be only one entry in the table that has a Result column entry of "1" for any Serial2D value.
Example table:
Serial2d  Completed  Result  PT_1  PT_2  PT_3  PT_4
--------  ---------  ------  ----  ----  ----  ----
A1        1:00AM     0       32.5  20    26    29
A1        1:02AM     0       32.5  10    29    40
A1        1:03AM     1       10    5     4     3
B1        1:04AM     0       29    4     1     9
B1        1:05AM     0       40    3     4     9
C1        1:06AM     1       9     7     6     4
What I would like to retrieve is:
Serial2d  Completed  Result  PT_1  PT_2  PT_3  PT_4
--------  ---------  ------  ----  ----  ----  ----
A1        1:03AM     1       10    5     4     3
B1        1:05AM     0       40    3     4     9
C1        1:06AM     1       9     7     6     4
I'm new to SQL and I'm still learning ALL the syntax. I'm finding it difficult to search for the correct operators to use since I'm not sure what I need, so please forgive my ignorance. A post with my answer could be staring me right in the face and I wouldn't know it; if so, please just point me to it.
I appreciate the answers to my previous post, but the answers weren't sufficient for me due to MY lack of information and ineptness with SQL. I know this is probably insanely easy for some, but try to remember when you first started SQL... that's where I'm at.
Since you are using SQL Server, you can use window functions to get this data.
Using a sub-query:
select *
from
(
select *,
row_number() over(partition by serial2d
order by result desc, completed desc) rn
from a12
) x
where rn = 1
Or you can use a CTE for this query:
;with cte as
(
select *,
row_number() over(partition by serial2d
order by result desc, completed desc) rn
from a12
)
select *
from cte c
where rn = 1;
You can group by Serial to get the MAX of each Time.
SELECT Serial, MAX([Time]) AS [Time]
FROM myTable
GROUP BY Serial
HAVING MAX(Result) >= 0
SELECT
t.Serial,
max_Result,
MAX([time]) AS max_time
FROM
myTable t inner join
(SELECT
Serial,
MAX([Result]) AS max_Result
FROM
myTable
GROUP BY
Serial) m on
t.serial = m.serial and
t.result = m.max_result
group by
t.serial,
max_Result
This can be solved using a correlated sub-query:
SELECT
T.serial,
T.[time],
T.result
FROM tablename T
WHERE
T.result = 1
OR
NOT EXISTS(
SELECT 1
FROM tablename
WHERE
serial = T.serial
AND (
[time] > T.[time]
OR
result = 1
)
)
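The answers above use placeholder names (Serial, [Time], myTable, tablename) from the earlier version of the question. Adapted to the A12 table and its columns as described in the corrected question, the correlated sub-query approach might look like this sketch:

```sql
-- One row per Serial2D: the row with Result = 1 if there is one,
-- otherwise the most recently completed row.
-- Assumption: table and column names are exactly as listed in the question.
SELECT T.*
FROM A12 AS T
WHERE T.Result = 1
   OR NOT EXISTS (
        SELECT 1
        FROM A12 AS O
        WHERE O.Serial2D = T.Serial2D
          AND (    O.Completed > T.Completed   -- a later row exists for this serial
                OR O.Result = 1 )              -- or a passing row exists for this serial
      );
```

For the example rows this should keep A1 at 1:03AM, B1 at 1:05AM and C1 at 1:06AM, matching the desired output. It relies on the stated rule that each Serial2D has at most one row with Result = 1.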