PostgreSQL: reuse Column Data In Different Column Of The Same Query - sql

I'm trying to create a SELECT query that does several calculated fields on one of two tables. I'm new to SQL (I've looked at several free online tutorials, so I have a general idea), but I think my goal is a little out of my skill range.
I have two tables:
TreeRecord with columns ID (serial), Site (chr)
Each ID represents an individual tree.
TreeHistory with columns ID (serial), TreeID (int), DBH (int)
DBH is tree diameter.
Currently I can create this:
| Site | Total tree count of site | Avg DBH of site |
I would like to have another column that can give the total count of trees over a particular size for each site. I can recreate this in a simple query, and my research on stack (SQL Select - Calculated Column if Value Exists in another Table) makes me feel that a nested SELECT is what I'm after but I can't get that to work. My current code is this:
SELECT
"TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"), 0) AS Average_DBH
FROM
"TreeRecord"
LEFT OUTER JOIN
"TreeHistory" ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY
"Site"
ORDER BY
"Site" ASC;
Any help on this would be most appreciated.
Thank you

Use count with the specific size condition.
SELECT "TreeRecord"."Site",
count("TreeRecord".*) AS Total_Count,
round(avg("TreeHistory"."DBH"),0) AS Average_DBH,
count(case when "TreeHistory"."DBH" > 10 then 1 end) AS count_over_specific_size -- change the threshold (10) as needed
FROM "TreeRecord"
LEFT OUTER JOIN "TreeHistory"
ON "TreeRecord"."ID" = "TreeHistory"."TreeID"
GROUP BY "Site"
ORDER BY "Site" ASC;
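To see the conditional count in action, here is a minimal sketch that runs the same pattern against an in-memory SQLite database from Python. SQLite stands in for PostgreSQL here, the sample rows are made up, and COUNT(tr.ID) replaces count("TreeRecord".*) because SQLite does not accept that form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TreeRecord (ID INTEGER PRIMARY KEY, Site TEXT);
    CREATE TABLE TreeHistory (ID INTEGER PRIMARY KEY, TreeID INTEGER, DBH INTEGER);
    -- Hypothetical sample data: two trees on site A, one on site B
    INSERT INTO TreeRecord (ID, Site) VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO TreeHistory (TreeID, DBH) VALUES (1, 8), (2, 14), (3, 20);
""")

# CASE yields NULL when the condition fails, and COUNT skips NULLs,
# so only rows with DBH > 10 are counted in the last column.
rows = conn.execute("""
    SELECT tr.Site,
           COUNT(tr.ID)                            AS total_count,
           ROUND(AVG(th.DBH), 0)                   AS average_dbh,
           COUNT(CASE WHEN th.DBH > 10 THEN 1 END) AS count_over_size
    FROM TreeRecord tr
    LEFT JOIN TreeHistory th ON tr.ID = th.TreeID
    GROUP BY tr.Site
    ORDER BY tr.Site
""").fetchall()

for row in rows:
    print(row)
```

The CASE expression has no ELSE branch on purpose: an ELSE 0 would make COUNT count every row, since 0 is not NULL.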

Related

Transposing and summing the top 5 results in Teradata SQL Assistant

I have a query that I converted from Access and is currently working correctly in Teradata SQL Assistant. The data pulled is just a standard table full of all of the data I need.
What I am wondering is: can something be added to this query that will sum up all of the Exposure values and then show only the top 5 Divisions, ordered from greatest to smallest sum? I would also like to transpose the data so that my Topics are the leftmost column.
Here is the working code, details omitted.
SELECT
A.AS_OF_DT
, B.DIVISION
, B.CLASS
, Sum(A.BALANCE/1000000) AS "Bal in MMs"
, Sum(A.EXPOSURE/1000000) AS "Exp in MMs"
, Sum(CASE WHEN A.STATUS = 'NACC' THEN (B.BALANCE/1000000) ELSE 0 END) AS "NPL Bal as MMs"
FROM DB.TABLE1 A LEFT JOIN DB.TABLE2 B ON A.NAICS = B.NAICS_CD
WHERE A.AS_OF_DT= '2017-03-31'
GROUP BY
A.AS_OF_DT,
B.DIVISION,
B.CLASS
ORDER BY SUM (A.EXPOSURE/1000000) DESC
Essentially I want the columns to be the following:
DIVISION|DATE|
Below DIVISION would only be the Top 5 DIVISIONS summarized by EXPOSURE (under DATE)
I can try and clarify if needed. Just let me know.
Thanks!
The end result is a data paste I can drop into Excel without the manual work of transposing the data in Excel and writing formulas to rummage through the thousands of results of the base query to summarize the individual Divisions and then pick the top 5 each month.
Thanks!
Shill
To get the top 5 for each division, you can use QUALIFY.
Add this to the end of your query:
QUALIFY ROW_NUMBER() OVER (PARTITION BY A.AS_OF_DT, B.DIVISION ORDER BY SUM(A.EXPOSURE/1000000) DESC) <= 5
For your other questions, SQL Assistant isn't much of a presentation tool, it won't do what you are asking for.
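For anyone who wants to try the ranking idea outside Teradata, here is a rough sketch in Python with SQLite (3.25 or later for window functions). SQLite has no QUALIFY, so the ROW_NUMBER() filter sits in an outer query instead. The table name, columns and rows are hypothetical, and the cut-off is 2 rather than 5 to keep the sample small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-in for the joined TABLE1/TABLE2 data
    CREATE TABLE exposures (division TEXT, class TEXT, exposure INTEGER);
    INSERT INTO exposures VALUES
        ('Retail',  'X', 5), ('Retail',  'Y', 7),
        ('Mining',  'X', 20),
        ('Farming', 'X', 3), ('Farming', 'Y', 1);
""")

TOP_N = 2  # the question asks for 5; 2 keeps the example small

# Step 1: sum exposure per division. Step 2: rank the sums.
# Step 3: keep only the first TOP_N ranks (what QUALIFY does in Teradata).
rows = conn.execute("""
    SELECT division, total_exposure
    FROM (
        SELECT division, total_exposure,
               ROW_NUMBER() OVER (ORDER BY total_exposure DESC) AS rn
        FROM (
            SELECT division, SUM(exposure) AS total_exposure
            FROM exposures
            GROUP BY division
        ) AS sums
    ) AS ranked
    WHERE rn <= ?
    ORDER BY rn
""", (TOP_N,)).fetchall()

print(rows)
```

Adding a PARTITION BY on the date column inside OVER (...) would restart the ranking for each reporting date, which is what a per-month top 5 would need.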
If your query already works,
try replacing:
SELECT
With:
SELECT top 10
(on line 1)

Oracle Data Gaps

I'm looking for a query that meets this requirement:
That currently gives us the number of BACs at the entity (which is something we need). The database assigns the BAC IDs consecutively within each accounting entity. So we need to add one more field to the query showing the current highest BAC ID at the entity. And once we have that, just filter the results down to anyplace the number of records doesn't equal the highest ID.
My current query:
select accounting_entity_id, count(bac_id)
from dc.pl_bac_information
group by accounting_entity_id
having count(bac_id) > 1;
Use analytic functions for this:
select bi.*
from (select bi.*, max(bac_id) over (partition by accounting_entity_id) as max_bac_id
from dc.pl_bac_information bi
) bi
where bac_id = max_bac_id;
This assumes you are using Oracle.
SELECT ACCOUNTING_ENTITY_ID
FROM DC.PL_BAC_INFORMATION
GROUP BY ACCOUNTING_ENTITY_ID
HAVING COUNT(BAC_ID) > 1 AND COUNT(BAC_ID) != MAX(BAC_ID);
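The count-versus-max comparison can be checked on a small data set. This is only a sketch in Python with SQLite standing in for Oracle; the rows are invented, and the count and highest ID are added to the output so the mismatch is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pl_bac_information (accounting_entity_id INTEGER, bac_id INTEGER);
    -- Hypothetical data: entity 1 has IDs 1..3, entity 2 is missing ID 3,
    -- entity 3 has a single BAC
    INSERT INTO pl_bac_information VALUES
        (1, 1), (1, 2), (1, 3),
        (2, 1), (2, 2), (2, 4),
        (3, 1);
""")

# If IDs are assigned consecutively from 1, COUNT equals MAX unless a row
# is missing, so any mismatch flags an entity with a gap.
rows = conn.execute("""
    SELECT accounting_entity_id,
           COUNT(bac_id) AS bac_count,
           MAX(bac_id)   AS max_bac_id
    FROM pl_bac_information
    GROUP BY accounting_entity_id
    HAVING COUNT(bac_id) > 1
       AND COUNT(bac_id) <> MAX(bac_id)
    ORDER BY accounting_entity_id
""").fetchall()

print(rows)
```

Note that the COUNT(bac_id) > 1 condition carried over from the original query would hide an entity whose only record has an ID above 1; drop it if those cases matter.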

Getting a unique value from an aggregated result set

I've got an aggregated query that checks if I have more than one record matching certain conditions.
SELECT RegardingObjectId, COUNT(*) FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN ('55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58')
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
That's nice, but then there is one field in AsyncOperationBase that will be unique. Say count(*) = 3, well, AsyncOperationBaseId in AsyncOperationBase will have 3 different values since AsyncOperationBase is the table's primary key.
To be honest, I would not even know what terms and expressions to Google to find a solution.
Does anyone have a solution? Also, are there any terms that describe what I'm looking for? Perhaps BI people often face this kind of requirement.
I could do it with an SSRS report where the report would visually do the grouping then I could expand each grouped row to get the AsyncOperationBaseId value, but simply through SQL, I can't seem to find a way out...
Thanks.
select * from [CRM_MSCRM].[dbo].[AsyncOperationBase]
where RegardingObjectId in
(
SELECT RegardingObjectId
FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN
(
'55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58'
)
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
)
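Here is a small sketch of that "grouped subquery feeds the outer query" pattern, run from Python against SQLite rather than SQL Server. The table, column names and rows are simplified stand-ins. One assumption about intent: the outer query repeats the StatusCode filter so only the rows that were actually counted come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, hypothetical stand-in for AsyncOperationBase
    CREATE TABLE async_ops (
        op_id        TEXT PRIMARY KEY,
        regarding_id TEXT,
        status_code  INTEGER
    );
    INSERT INTO async_ops VALUES
        ('a1', 'r1', 10), ('a2', 'r1', 10),
        ('a3', 'r2', 10),
        ('a4', 'r3', 10), ('a5', 'r3', 30);
""")

# The inner query finds the groups with more than one matching row;
# the outer query returns the individual rows (and their unique keys)
# that belong to those groups.
rows = conn.execute("""
    SELECT op_id, regarding_id
    FROM async_ops
    WHERE status_code = 10
      AND regarding_id IN (
          SELECT regarding_id
          FROM async_ops
          WHERE status_code = 10
          GROUP BY regarding_id
          HAVING COUNT(*) > 1
      )
    ORDER BY op_id
""").fetchall()

print(rows)
```

Without the outer status_code filter, any other row sharing a flagged regarding_id would be returned as well, whatever its status.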

SQL query for selecting all items except given name and all rows with id of that given name

I apologize for the messy title. Please consider following tables:
CAR_MODEL : car_model_id, car_name
CAR_INVENTORY : car_model_id, car_location_name
The user would pass in a car_location_name, and I would like to get a list of all car_name EXCLUDING rows with given car_location_name, and all cars with the id of that car_location_name.
Let me explain further.
For a join as such, let's assume that the user passes in "Germany." Then I would like to get a list excluding row #2 and #6, which have car_location_name of "Germany." I would also like to exclude any rows with the car_id of row with Germany. (In this case car_id of 2 and 6, so any row with car_id of 2 or 6 should be eliminated.)
In this case, since Germany has car_id of 2, I would like to get rid of the row with car_location_name of "Canada", since it also has car_id of 2.
The result should be:
What sql query (Can be sql server specific) can I use to achieve this?
I'm sorry if the explanation is confusing - please ask questions if you are having trouble understanding what I'm trying to say.
Simplest is probably to do the join to get the results as usual, and then just eliminate all car_model_ids that exist in Germany:
SELECT cm.car_model_id, ci.car_location_name, cm.car_name
FROM CAR_MODEL cm
JOIN CAR_INVENTORY ci
ON cm.car_model_id=ci.car_model_id
WHERE cm.car_model_id NOT IN (
SELECT car_model_id FROM CAR_INVENTORY WHERE car_location_name='Germany'
)
An SQLfiddle to test with.
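If the fiddle is unavailable, the same query can be tried locally. Below is a minimal sketch using Python and an in-memory SQLite database instead of SQL Server; the car names and locations are made up, and the location is passed as a bound parameter rather than a literal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CAR_MODEL (car_model_id INTEGER PRIMARY KEY, car_name TEXT);
    CREATE TABLE CAR_INVENTORY (car_model_id INTEGER, car_location_name TEXT);
    -- Hypothetical sample data: model 2 is stocked in Germany and Canada
    INSERT INTO CAR_MODEL VALUES (1, 'Civic'), (2, 'Golf'), (3, 'Model S');
    INSERT INTO CAR_INVENTORY VALUES
        (1, 'Japan'), (2, 'Germany'), (2, 'Canada'), (3, 'USA');
""")

location = "Germany"  # value supplied by the user

# The subquery collects every model found at the given location; NOT IN
# then drops all rows for those models, including other locations.
rows = conn.execute("""
    SELECT cm.car_model_id, ci.car_location_name, cm.car_name
    FROM CAR_MODEL cm
    JOIN CAR_INVENTORY ci ON cm.car_model_id = ci.car_model_id
    WHERE cm.car_model_id NOT IN (
        SELECT car_model_id
        FROM CAR_INVENTORY
        WHERE car_location_name = ?
    )
    ORDER BY cm.car_model_id
""", (location,)).fetchall()

print(rows)
```

One caveat: if the subquery can return a NULL car_model_id, NOT IN matches nothing at all, so a NOT EXISTS form may be safer on nullable columns.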

SQL Output Question

Edited
I am running into an error and I know what is happening but I can't see what is causing it. Below is the sql code I am using. Basically I am getting the general results I want, however I am not accurately giving the query the correct 'where' clause.
If this is of any assistance. The count is coming out as this:
Total Tier
1 High
2 Low
There are 4 records in the Enrollment table. 3 are active, and 1 is not. Only 2 of the records should be displayed. 1 for High, and 1 for low. The second Low record that is in the total was flagged as 'inactive' on 12/30/2010 and reflagged again on 1/12/2011 so it should not be in the results. I changed the initial '<=' to '=' and the results stayed the same.
I need to exclude any record from Enrollments_Status_Change where the "active_status" was changed to 0 before the date.
SELECT COUNT(dbo.Enrollments.Customer_ID) AS Total,
dbo.Phone_Tier.Tier
FROM dbo.Phone_Tier as p
JOIN dbo.Enrollments as e ON p.Phone_Model = e.Phone_Model
WHERE (e.Customer_ID NOT IN
(Select Customer_ID
From dbo.Enrollment_Status_Change as Status
Where (Change_Date >'12/31/2010')))
GROUP BY dbo.Phone_Tier.Tier
Thanks for any assistance and I apologize for any confusion. This is my first time here and I'm trying to correct my etiquette on the fly.
If you don't want any of the fields from that table dbo.Enrollment_Status_Change, and you don't seem to use it in any way — why even include it in the JOINs? Just leave it out.
Plus: start using table aliases. This is very hard to read if you use the full table name in each JOIN condition and WHERE clause.
Your code should be:
SELECT
COUNT(e.Customer_ID) AS Total, p.Tier
FROM
dbo.Phone_Tier p
INNER JOIN
dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE
e.Active_Status = 1
AND EXISTS (SELECT DISTINCT Customer_ID
FROM dbo.Enrollment_Status_Change AS Status
WHERE (Change_Date <= '12/31/2010'))
GROUP BY
p.Tier
Also: most likely, your EXISTS check is wrong — since you didn't post your table structures, I can only guess — but my guess would be:
AND EXISTS (SELECT * FROM dbo.Enrollment_Status_Change
WHERE Change_Date <= '12/31/2010' AND Customer_ID = e.Customer_ID)
Check for existence of any entries in dbo.Enrollment_Status_Change for the customer defined by e.Customer_ID, with a Change_Date on or before that cut-off date. Right?
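The difference between an uncorrelated and a correlated EXISTS is easy to miss, so here is a small sketch of both, run from Python against SQLite. The tables are cut down to the columns that matter, the rows are invented, and dates are stored as ISO strings so they compare correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified tables
    CREATE TABLE enrollments (customer_id INTEGER);
    CREATE TABLE status_change (customer_id INTEGER, change_date TEXT);
    INSERT INTO enrollments VALUES (1), (2), (3);
    INSERT INTO status_change VALUES (1, '2010-12-30'), (2, '2011-01-12');
""")

# Uncorrelated: the subquery never mentions e, so it is either true for
# every enrollment row or false for every row.
uncorrelated = conn.execute("""
    SELECT e.customer_id
    FROM enrollments e
    WHERE EXISTS (SELECT 1 FROM status_change s
                  WHERE s.change_date <= '2010-12-31')
    ORDER BY e.customer_id
""").fetchall()

# Correlated: the subquery is tied to the current row through e.customer_id,
# so it is evaluated per customer.
correlated = conn.execute("""
    SELECT e.customer_id
    FROM enrollments e
    WHERE EXISTS (SELECT 1 FROM status_change s
                  WHERE s.change_date <= '2010-12-31'
                    AND s.customer_id = e.customer_id)
    ORDER BY e.customer_id
""").fetchall()

print(uncorrelated, correlated)
```

With this data the uncorrelated form returns every customer, while the correlated form returns only the customer who has a change on or before the cut-off.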
Assuming you want to:
exclude all customers whose latest enrollment_status_change record was since the start of 2011
but
include all customers whose latest enrollment_status_change record was earlier than the end of 2010 (why else would you have put that EXISTS clause in?)
Then this should do it:
SELECT COUNT(e.Customer_ID) AS Total,
p.Tier
FROM dbo.Phone_Tier p
JOIN dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE e.Active_Status = 1
AND e.Customer_ID NOT IN (
SELECT Customer_ID
FROM dbo.Enrollment_Status_Change status
WHERE (Change_Date >= '2011-01-01')
)
GROUP BY p.Tier
Basically, the problem with your code is that joining a one-to-many table will always increase the row count. If you wanted to exclude all the records that had a matching row in the other table this would be fine -- you could just use a LEFT JOIN and then set a WHERE clause like Customer_ID IS NULL.
But because you want to exclude a subset of the enrollment_status_change table, you must use a subquery.
Your intention is not clear from the example given, but if you wanted to exclude anyone whose enrollment_status_change was before 2011, but include those whose status change was since 2011, you'd just swap the date comparator for <.
Is this any help?
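For what it's worth, here is a rough sketch of the NOT IN version against data shaped like the question describes (four enrollments, three active, one customer re-flagged in January 2011). It runs in Python on SQLite rather than SQL Server, and the table contents and ISO date strings are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data modelled on the description in the question
    CREATE TABLE phone_tier (phone_model TEXT, tier TEXT);
    CREATE TABLE enrollments (customer_id INTEGER, phone_model TEXT,
                              active_status INTEGER);
    CREATE TABLE enrollment_status_change (customer_id INTEGER,
                                           change_date TEXT);
    INSERT INTO phone_tier VALUES ('P1', 'High'), ('P2', 'Low');
    INSERT INTO enrollments VALUES
        (1, 'P1', 1), (2, 'P2', 1), (3, 'P2', 1), (4, 'P2', 0);
    INSERT INTO enrollment_status_change VALUES
        (3, '2010-12-30'), (3, '2011-01-12');
""")

# Joining the one-to-many table directly yields one row per status change,
# so customer 3 would be counted twice.
joined_rows = conn.execute("""
    SELECT COUNT(*)
    FROM enrollments e
    JOIN enrollment_status_change s ON s.customer_id = e.customer_id
    WHERE e.customer_id = 3
""").fetchone()[0]

# The subquery removes customers with a change since 2011 without
# multiplying the enrollment rows.
rows = conn.execute("""
    SELECT p.tier, COUNT(e.customer_id) AS total
    FROM phone_tier p
    JOIN enrollments e ON p.phone_model = e.phone_model
    WHERE e.active_status = 1
      AND e.customer_id NOT IN (
          SELECT customer_id
          FROM enrollment_status_change
          WHERE change_date >= '2011-01-01'
      )
    GROUP BY p.tier
    ORDER BY p.tier
""").fetchall()

print(joined_rows, rows)
```

On this sample the result is one High and one Low, which matches the two records the question expects to see.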