Oracle SQL - Categorizing records based on logic

What is the best way to categorize records based on logic?
For example, from this table:
ID  House   Farm
1   (null)  (null)
I would like to output:
ID  Missing
1   House
1   Farm
Aside from the obvious UNION ALL below, is there a better way, perhaps a CASE WHEN? UNION ALL does not scale well to a larger number of conditions.
select ID, 'House' as Missing from table where house is null
union all
select ID, 'Farm' as Missing from table where farm is null

I don't know whether it is more efficient than UNION ALL, but another option is UNPIVOT, which is available from Oracle 11g onward:
SELECT ID, Missing
FROM (
SELECT *
FROM YourTable
UNPIVOT INCLUDE NULLS (IsMissing FOR Missing IN (House as 'House', Farm as 'Farm'))
) t
WHERE IsMissing IS NULL

You can also get the result with UNPIVOT. For more details, see the articles "Pivot and unpivot queries in 11g" and "PIVOT and UNPIVOT Operators in Oracle Database 11g Release 1".
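SELECT
ID, MISSING
FROM
(
-- Flag each missing column with 1; UNPIVOT drops rows with a NULL flag by default
SELECT ID, CASE WHEN HOUSE IS NULL THEN 1 END HOUSE, CASE WHEN FARM IS NULL THEN 1 END FARM FROM YourTable
)x
UNPIVOT (
DCol
FOR MISSING
IN (HOUSE, FARM)
);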
or
SELECT
ID, MISSING
FROM YourTable
UNPIVOT INCLUDE NULLS (
DCol
FOR MISSING
IN (HOUSE, FARM)
)
WHERE DCol IS NULL;
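The "case when" idea raised in the question can be sketched as follows. This is not one of the answers above but a swapped-in technique: cross join the table to a short list of column names and keep the pairs where a CASE expression on the matching column is NULL. The sketch runs on SQLite through Python's sqlite3 module as a stand-in for Oracle, and the extra rows and column types are assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample data modelled on the question; column types are assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER, House TEXT, Farm TEXT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, None, None), (2, "H2", None), (3, "H3", "F3")],
)

# One candidate row per (record, checked column); keep those whose column is NULL.
# A new condition needs one more name in the list and one more CASE branch.
query = """
SELECT t.ID, c.col AS Missing
FROM YourTable t
CROSS JOIN (SELECT 'House' AS col UNION ALL SELECT 'Farm') c
WHERE CASE c.col WHEN 'House' THEN t.House
                 WHEN 'Farm'  THEN t.Farm END IS NULL
ORDER BY t.ID, c.col DESC
"""
missing = conn.execute(query).fetchall()
print(missing)
```

The table is scanned once and each condition lives in a single place, but the CASE branches must all return compatible types.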

Related

SQL count(distinct) from both the table

I have two tables, Table A and Table B, and each has a column called "name". I want count(distinct name), where the names are taken from both columns together.
For example:
Table A
name
A
B
C
Table B
name
A
B
D
Output should be 4.
The general approach is to first combine the data the way you want in a subquery, and then deduplicate it (or apply whatever second step you need).
For example:
WITH COMBINED AS (
SELECT
name
FROM
TableA
UNION ALL
SELECT
name
FROM
TableB
)
SELECT
DISTINCT name
FROM
COMBINED
In your situation, the 2nd step can be accomplished by changing UNION ALL to a UNION. This will dedupe the values automatically. You won't even need a subquery or a 2nd step. But I wanted to teach you the concept because it comes up often.
SELECT name FROM TableA
UNION
SELECT name FROM TableB
The UNION in the CTE removes all duplicates, so a COUNT(*) will suffice:
WITH CTE AS (
SELECT name FROM TableA
UNION
SELECT name FROM TableB
)
SELECT COUNT(*) FROM CTE
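As a quick check of the two approaches above, here is a minimal sketch using the question's sample data. It runs on SQLite through Python's sqlite3 module, which is an assumption of this example and not the asker's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (name TEXT)")
conn.execute("CREATE TABLE TableB (name TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?)", [("A",), ("B",), ("C",)])
conn.executemany("INSERT INTO TableB VALUES (?)", [("A",), ("B",), ("D",)])

# UNION removes duplicates, so COUNT(*) over it counts the distinct names.
count_union = conn.execute("""
    WITH cte AS (
        SELECT name FROM TableA
        UNION
        SELECT name FROM TableB
    )
    SELECT COUNT(*) FROM cte
""").fetchone()[0]

# UNION ALL keeps duplicates; COUNT(DISTINCT ...) removes them afterwards.
count_distinct = conn.execute("""
    SELECT COUNT(DISTINCT name)
    FROM (SELECT name FROM TableA UNION ALL SELECT name FROM TableB)
""").fetchone()[0]

print(count_union, count_distinct)
```

Both queries return 4 for this data, matching the expected output in the question.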
This query should do it:
SELECT COUNT(DISTINCT name) AS total_names
FROM (
SELECT name FROM TableA
UNION ALL
SELECT name FROM TableB
) t;
Note: written for SQL Server
Yet another option, if an approximate count is acceptable (these are BigQuery's HLL_COUNT functions):
select hll_count.merge(hll_sketch) names
from (
select hll_count.init(name) hll_sketch from tableA
union all
select hll_count.init(name) from tableB
)
HLL++ functions are approximate aggregate functions. Approximate aggregation typically requires less memory than exact aggregation functions, like COUNT(DISTINCT), but also introduces statistical error. This makes HLL++ functions appropriate for large data streams for which linear memory usage is impractical, as well as for data that is already approximate.
See more about benefits of using HyperLogLog++ functions

Convert row values as columns in SQL Server

Table:
CompanyID  Lead  LeadManager
------------------------------
1          2     3
Required output:
CompanyID  Role         RoleID
--------------------------------
1          Lead         2
1          Leadmanager  3
You can use union all to unpivot your dataset. This is a standard solution that works across most (if not all) RDBMS:
select companyID, 'Lead' as Role, Lead as RoleID from mytable
union all select companyID, 'LeadManager', LeadManager from mytable
You can use apply to unpivot the data:
select v.*
from t cross apply
(values (t.CompanyId, 'Lead', t.Lead),
(t.CompanyId, 'LeadManager', t.LeadManager)
) v(CompanyId, Role, RoleId);
The advantage of this approach is that it scans the original table only once. This can be particularly helpful when the "table" is a complex query.
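The UNION ALL form of the unpivot can be tried out with a small sketch. It runs on SQLite through Python's sqlite3 module, which is an assumption of this example; SQLite has no CROSS APPLY, so only the portable variant is shown. The table name mytable comes from the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (CompanyID INTEGER, Lead INTEGER, LeadManager INTEGER)"
)
conn.execute("INSERT INTO mytable VALUES (1, 2, 3)")

# Each branch turns one source column into a (Role, RoleID) row.
rows = conn.execute("""
    SELECT CompanyID, 'Lead' AS Role, Lead AS RoleID FROM mytable
    UNION ALL
    SELECT CompanyID, 'LeadManager', LeadManager FROM mytable
    ORDER BY CompanyID, Role
""").fetchall()
print(rows)
```

The column names of the result come from the first SELECT, which is why only that branch needs aliases.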

SQL combine 2 queries to one where 2 queries are from different database

I'm trying to combine two query results to one where both the tables are present in different databases like below:
select
(select COUNT(DISTINCT BaseVehicleID) as BVOld
from BaseVehicle) Old,
(select COUNT(DISTINCT BaseVehicleID) as BVNew
from [EnhancedStandard_VCDB_Exported_PRD_3006].BaseVehicle) New
Here [EnhancedStandard_VCDB_Exported_PRD_3006] is a different database.
I need to validate the record counts in both databases.
I am able to combine results from queries against the same database.
Can someone please tell me how to combine the results of two queries against two different databases?
Are you looking for 3-part naming? If so, this will probably work:
select (select COUNT(DISTINCT BaseVehicleID)
from BaseVehicle
) as Old,
(Select COUNT(DISTINCT BaseVehicleID)
from [EnhancedStandard_VCDB_Exported_PRD_3006].dbo.BaseVehicle
) New
You can use UNION ALL to combine the results of both queries into one result set.
This assumes both databases are hosted on the same SQL Server instance. If they are not, you need to refer to the table on the remote server through a linked server, using a four-part name like LinkedServerName.DatabaseName.SchemaName.TableName.
If both databases are on the same server, you can use the following query. Note that I'm assuming your table is in the default schema, i.e. dbo.
Select COUNT(DISTINCT BaseVehicleID) as BVOldCount
from BaseVehicle
UNION ALL
Select COUNT(DISTINCT BaseVehicleID) as BVNewCount
from [EnhancedStandard_VCDB_Exported_PRD_3006].dbo.BaseVehicle;
Or
Select COUNT(DISTINCT BaseVehicleID) as BVOldCount, 'BVOldCount' as Type
from BaseVehicle
UNION ALL
Select COUNT(DISTINCT BaseVehicleID) as BVNewCount, 'BVNewCount' as Type
from [EnhancedStandard_VCDB_Exported_PRD_3006].dbo.BaseVehicle;
Try This:
SELECT COUNT(DISTINCT Base.BaseVehicleID) AS BVNew ,
Old.BVOld
FROM [EnhancedStandard_VCDB_Exported_PRD_3006].dbo.BaseVehicle AS Base
CROSS APPLY ( SELECT COUNT(DISTINCT B2.BaseVehicleID) AS BVOld
FROM BaseVehicle AS B2
) Old
GROUP BY Old.BVOld
If the other database is on another server, you need to create a linked server and use a query like the one below:
SELECT (SELECT count(*) FROM [serverName].[DatabaseName].dbo.TableName)
+
(SELECT count(*) FROM [serverName].[DatabaseName].dbo.TableName)
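SQL Server's three-part and four-part names cannot be run outside SQL Server, but the underlying idea of qualifying a table with its database name can be sketched with SQLite's ATTACH DATABASE through Python's sqlite3 module. This is only an analogy under that assumption: the schema name new_db and the sample IDs are hypothetical, and SQLite names have no dbo schema level.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Attach a second, separate in-memory database under the name new_db.
conn.execute("ATTACH DATABASE ':memory:' AS new_db")
conn.execute("CREATE TABLE main.BaseVehicle (BaseVehicleID INTEGER)")
conn.execute("CREATE TABLE new_db.BaseVehicle (BaseVehicleID INTEGER)")
conn.executemany("INSERT INTO main.BaseVehicle VALUES (?)", [(1,), (2,), (2,)])
conn.executemany("INSERT INTO new_db.BaseVehicle VALUES (?)", [(1,), (2,), (3,)])

# One row with both counts, each subquery naming its database explicitly.
old_count, new_count = conn.execute("""
    SELECT (SELECT COUNT(DISTINCT BaseVehicleID) FROM main.BaseVehicle)   AS OldCount,
           (SELECT COUNT(DISTINCT BaseVehicleID) FROM new_db.BaseVehicle) AS NewCount
""").fetchone()
print(old_count, new_count)
```

Returning both counts as columns of a single row makes the comparison a simple equality check, whereas the UNION ALL variants return one row per database.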

Oracle SQL to get Unique Records

Does anyone know the SQL to pull 4 rows from the following table, which contains 8 rows?
I just want one row for each arbitrary person.
The real data will be thousands of records, so it must be generic and use only the IDs, not the names.
table
You seem to have a symmetric relationship. So, you can do:
select t.*
from t
where t.id < t.pid;
select
ID,
FName,
LName
from your_table
union
select
PID,
PFName,
PLName
from your_table
order by 3, 2, 1
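The id < pid filter from the first answer can be checked with a small sketch. The original table was not preserved, so the data here is hypothetical: four pairs, each stored in both directions, with only the two ID columns. It runs on SQLite through Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, pid INTEGER)")

pairs = [(1, 2), (3, 4), (5, 6), (7, 8)]
# Store each pair in both directions, giving 8 rows in total.
conn.executemany("INSERT INTO t VALUES (?, ?)", pairs + [(b, a) for a, b in pairs])

# In a symmetric relationship exactly one direction of each pair has id < pid.
kept = conn.execute("SELECT id, pid FROM t WHERE id < pid ORDER BY id").fetchall()
print(kept)
```

This only works when every pair really is stored in both directions; a pair stored once with id > pid would be dropped entirely.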

Oracle SQL -- What's wrong with this grouping?

I am trying to grab a row that has the max of some column. Normally I'd use Rank for this and just select rank = 1 but that seems pointless when I know I just need the max of a column. Here is my SQL:
SELECT
name,
value,
MAX(version)
FROM
my_table t
WHERE
person_type = 'STUDENT'
GROUP by NAME,VALUE
HAVING version = max(version)
This returns the "you've done something wrong involving grouping" error, i.e. ORA-00979 "not a GROUP BY expression", when I try to run it. If I add version to the GROUP BY, the SQL runs, but it obviously returns all rows instead of just the max version of each.
So my question is mostly "Why doesn't this work?" I am selecting the max of version so I don't see why I need to group by it. I know there are other solutions (partition over, rank ...) but I am more interested in why this in particular is flawed syntactically.
EDIT: More explicit about the use of this having clause.
Let's say there are these two rows in table t:
NAME VALUE VERSION
JEREMY C 1
JEREMY A 2
What is returned from this query should be:
JEREMY A 2
But if I remove having then I would get:
JEREMY A 2
JEREMY C 1
The HAVING clause, in general, needs to contain columns that are produced by the group by. In fact, you can think of the HAVING clause as a WHERE on the group by.
That is, the query:
select <whatever>
from t
group by <whatever>
having <some condition>
is equivalent to:
select <whatever>
from (select <whatever>
from t
group by <whatever>
) t
where <some condition>
If you think about it this way, you'll realize that max(version) makes sense because it is an aggregated value. However, "version" does not make sense, since it is neither a calculated value nor a group by column.
You seem to know how to fix this. One other comment: some databases (notably MySQL, when ONLY_FULL_GROUP_BY is disabled) would accept your syntax. They treat "HAVING version = max(version)" as "HAVING any(version) = max(version)".
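The equivalence between HAVING and a WHERE on the grouped subquery can be sketched as follows. The rows are the ones used in the sample execution at the end of this thread, the threshold of 1 is arbitrary, and the code runs on SQLite through Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, value TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("JEREMY", "C", 1), ("JEREMY", "A", 2), ("SARAH", "D", 2), ("SARAH", "X", 1)],
)

# Filter on the aggregate directly with HAVING.
having_rows = conn.execute("""
    SELECT name, value, MAX(version) AS max_version
    FROM t
    GROUP BY name, value
    HAVING MAX(version) > 1
    ORDER BY name
""").fetchall()

# Same filter written as a WHERE on the grouped subquery.
where_rows = conn.execute("""
    SELECT name, value, max_version
    FROM (SELECT name, value, MAX(version) AS max_version
          FROM t
          GROUP BY name, value) g
    WHERE max_version > 1
    ORDER BY name
""").fetchall()

print(having_rows == where_rows, having_rows)
```

In the second form the outer WHERE can only see name, value and max_version, which is the same reason a bare version column has no meaning in HAVING.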
This SQL statement fails because the HAVING clause runs after the GROUP BY: it can only operate on aggregates or on columns that are listed in the GROUP BY clause. If you have only grouped by NAME and VALUE, VERSION alone has no meaning. It can have many possible values for every combination of NAME and VALUE at that point, so it doesn't make sense to compare it to MAX(version) or any other aggregate, which has exactly one value for every NAME and VALUE pair.
You're trying to use version in your HAVING clause, but it's not being grouped by.
If all you want is the name, value and max version, you don't need the HAVING clause at all.
SELECT
name,
value,
MAX(version)
FROM
my_table t
WHERE
person_type = 'STUDENT'
GROUP by NAME,VALUE
The HAVING clause is for when you want to have a "Where" clause after aggregation, like
HAVING max(version) > 5
EDIT:
Based on your sample data, you're grouping by VALUE but what you really want to do is identify the VALUE that has the MAX(VERSION) for each NAME.
To do this, you need to use a WHERE EXISTS or self join, like so:
select name, value, version from t
where exists
(
select 1 from
(select name, max(version) version
from t
group by name) s
where s.name = t.name and s.version = t.version
)
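To see the difference between grouping by VALUE and picking the row with the highest version per NAME, here is a minimal sketch of the EXISTS query above. It runs on SQLite through Python's sqlite3 module as a stand-in for Oracle, and the rows are those from the question plus the two SARAH rows used in the sample execution later in this thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, value TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("JEREMY", "C", 1), ("JEREMY", "A", 2), ("SARAH", "D", 2), ("SARAH", "X", 1)],
)

# Grouping by name AND value keeps one row per distinct value.
grouped = conn.execute("""
    SELECT name, value, MAX(version)
    FROM t
    GROUP BY name, value
    ORDER BY name, value
""").fetchall()

# The EXISTS form keeps only the row carrying the highest version per name.
latest = conn.execute("""
    SELECT name, value, version FROM t
    WHERE EXISTS (
        SELECT 1 FROM
            (SELECT name, MAX(version) AS version
             FROM t
             GROUP BY name) s
        WHERE s.name = t.name AND s.version = t.version
    )
    ORDER BY name
""").fetchall()

print(grouped)
print(latest)
```

If two rows of the same name tie on the highest version, the EXISTS form returns both of them.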
Another way of getting what you want:
select *
from (select name
, value
, version
, max(version) over
(partition by name) as max_version
from t)
where version = max_version;
Sample execution:
SQL> create table t (name varchar2(30)
2 , value varchar2(1)
3 , version number not null
4 , constraint t_pk primary key (name, version));
Table created.
SQL> insert into t select 'JEREMY', 'C', 1 from dual
2 union all select 'JEREMY', 'A', 2 from dual
3 union all select 'SARAH', 'D', 2 from dual
4 union all select 'SARAH', 'X', 1 from dual;
4 rows created.
SQL> commit;
Commit complete.
SQL> select name, value, version
2 from (select name
3 , value
4 , version
5 , max(version) over
6 (partition by name) as max_version
7 from t)
8 where version = max_version;
NAME                           V    VERSION
------------------------------ - ----------
JEREMY                         A          2
SARAH                          D          2