I have a large data set of truck tickets that generates two lines of output per ticket, because each ticket has an "out" and an "in" component. I want to generate one line of output that includes information from both the "out" portion of the ticket and the "in" portion.
SELECT Ticket_number, Oil_volume, Facility_ID, Ticket_Type
FROM Truckticket T
JOIN TBATTERY TB
ON TB.Battery_ID = T.Battery_ID
My Output has two lines:
Ticket_number   Oil_volume   Facility_ID   Ticket_type
-------------------------------------------------------
1               10           SK01          O
1               10           SK02          I
Now, what I want my output to be when I use a where clause on Facility_ID SK01:
Ticket_number   Oil_volume   Facility_ID   Facility_ID   Ticket_type
---------------------------------------------------------------------
1               10           SK01          SK02          O
I know I have to do this with a subquery or CTE to get Facility_ID SK02 on the same line, but I'm stuck. I hope I have presented my question OK; this is my first time. Thanks!
Instead of a subquery or CTE, this looks like a good time to use aggregation and pivoting. With a small number of static types, a simple MAX(CASE... expression with a GROUP BY can pivot the rows to columns.
SELECT
Ticket_number,
MAX(CASE WHEN Ticket_Type = 'O' then Oil_volume else NULL END) Oil_volume_out,
MAX(CASE WHEN Ticket_Type = 'I' then Oil_volume else NULL END) Oil_volume_in,
MAX(CASE WHEN Ticket_Type = 'O' then Facility_ID else NULL END) Facility_ID_out,
MAX(CASE WHEN Ticket_Type = 'I' then Facility_ID else NULL END) Facility_ID_in
FROM Truckticket T
JOIN TBATTERY TB
ON TB.Battery_ID = T.Battery_ID
GROUP BY Ticket_number;
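To try the pivot without access to the real tables, here is a minimal sketch that runs the same MAX(CASE ...) pattern in an in-memory SQLite database through Python's sqlite3 module. It assumes a simplified Truckticket table that already carries Facility_ID (the join to TBATTERY is left out), and the two rows are the sample data from the question.

```python
import sqlite3

# Assumption: a simplified Truckticket table that already holds Facility_ID,
# so the join to TBATTERY from the question is not needed for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Truckticket (
        Ticket_number INTEGER,
        Oil_volume    INTEGER,
        Facility_ID   TEXT,
        Ticket_Type   TEXT
    );
    INSERT INTO Truckticket VALUES (1, 10, 'SK01', 'O');
    INSERT INTO Truckticket VALUES (1, 10, 'SK02', 'I');
""")

# Each CASE yields a value only for the matching ticket type and NULL
# otherwise; MAX ignores NULLs, so one row per ticket survives the GROUP BY.
pivot_sql = """
    SELECT Ticket_number,
           MAX(CASE WHEN Ticket_Type = 'O' THEN Oil_volume  END) AS Oil_volume_out,
           MAX(CASE WHEN Ticket_Type = 'I' THEN Oil_volume  END) AS Oil_volume_in,
           MAX(CASE WHEN Ticket_Type = 'O' THEN Facility_ID END) AS Facility_ID_out,
           MAX(CASE WHEN Ticket_Type = 'I' THEN Facility_ID END) AS Facility_ID_in
    FROM Truckticket
    GROUP BY Ticket_number
"""
rows = conn.execute(pivot_sql).fetchall()
print(rows)  # [(1, 10, 10, 'SK01', 'SK02')]
```

To restrict the result to tickets that went out from SK01, one option would be a HAVING clause on the Facility_ID_out expression rather than a WHERE on Facility_ID, since a WHERE would drop the "in" row before it can be pivoted.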
I'm relatively new to coding and SQL so please bear with me.
I'm currently working on a query and I have no idea how to get the infinite loop to stop without using a case statement. When I use the case statement I get each value on its own row rather than the values all together in the combination they're supposed to be in.
Case statement SQL
select
CASE
When Attribute_id = '5024923' Then attribute_value
END Page_Name,
CASE
When Attribute_id = '5024925' Then attribute_value
END Site_Name,
CASE
When Attribute_id = '5024924' Then attribute_value
END Last_Touch_Channel,
count(distinct MASTER_CONTACT_ID) known_contact_count,
count (distinct visitor_id) total_contact_Count,
ACTION_DATE
from Adobe_Analytics_Staging
where ATTRIBUTE_ID in ('5024925','5024924','5024923')
group by ATTRIBUTE_ID, ACTION_DATE, ATTRIBUTE_VALUE
Example:
Error with Case statement:
Column A   Column B   Column C
-------------------------------
value1     NULL       NULL
NULL       value2     NULL
NULL       NULL       value3
When in the data it is value1, value2, value3 on the same row.
So I'm trying a new avenue. I suspect the loop is because I'm linking back to the table so many times but I have limited the amount of results to the best of my ability to reduce the amount of records being sent through. Each query works and works fast individually. It's collectively that it slows down a ton.
The reason for joining to the table so many times is because I have to distinguish different types of values within one column.
Note: Not sure if it's relevant, but the different values in the table correlate to a specific ID number within that table. Attribute value and attribute ID are different columns.
For example in Table A the column looks like this
Column
A
B
C
I have to make it look like this:
Column 1   Column 2   Column 3
-------------------------------
A          B          C
select
a.ATTRIBUTE_VALUE,
b.ATTRIBUTE_VALUE,
c.ATTRIBUTE_VALUE,
count(distinct aas.MASTER_CONTACT_ID) known_contact_count,
count (distinct d.visitor_id) total_contact_Count,
aas.ACTION_DATE
from Adobe_Analytics_Staging aas
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024923') a on a.VISITOR_ID = aas.VISITOR_ID
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024925') b on b.VISITOR_ID = aas.VISITOR_ID
left join (select ATTRIBUTE_VALUE, VISITOR_ID from Adobe_Analytics_Staging
where Attribute_id = '5024924') c on c.VISITOR_ID = aas.VISITOR_ID
inner join (select visitor_id from Adobe_Analytics_Staging
where ATTRIBUTE_ID in ('5024923','5024925','5024924')) d
on d.VISITOR_ID = aas.VISITOR_ID
--where aas.VISITOR_ID = '3438634761938550664_6795123974460253552'
group by a.ATTRIBUTE_VALUE, b.ATTRIBUTE_VALUE, c.ATTRIBUTE_VALUE, aas.ACTION_DATE
SELECT
VISITOR_ID,
MAX(CASE WHEN Attribute_id = '5024923' Then attribute_value END) Page_Name,
MAX(CASE WHEN Attribute_id = '5024925' Then attribute_value END) Site_Name,
MAX(CASE WHEN Attribute_id = '5024924' Then attribute_value END) Last_Touch_Channel,
COUNT(distinct MASTER_CONTACT_ID) known_contact_count,
COUNT(distinct visitor_id) total_contact_Count,
ACTION_DATE
FROM ContactTargeting.dbo.Adobe_Analytics_Staging
GROUP BY VISITOR_ID, ACTION_DATE
See this fiddle with some demo data
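The difference between the question's CASE query and the MAX(CASE ...) version comes down to what is in the GROUP BY. The sketch below, run in SQLite through Python's sqlite3 module, puts both side by side. The table is a cut-down stand-in for Adobe_Analytics_Staging and the three rows (visitor 'v1', values 'Home', 'SiteA', 'Email') are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Adobe_Analytics_Staging (
        VISITOR_ID TEXT, ATTRIBUTE_ID TEXT,
        ATTRIBUTE_VALUE TEXT, ACTION_DATE TEXT
    )
""")
# Hypothetical sample data: one visitor, one row per attribute.
conn.executemany(
    "INSERT INTO Adobe_Analytics_Staging VALUES (?, ?, ?, ?)",
    [
        ("v1", "5024923", "Home",  "2021-01-01"),
        ("v1", "5024925", "SiteA", "2021-01-01"),
        ("v1", "5024924", "Email", "2021-01-01"),
    ],
)

# Grouping by ATTRIBUTE_ID keeps one output row per attribute, so every
# row has exactly one non-NULL column (the pattern shown in the question).
scattered = conn.execute("""
    SELECT CASE WHEN ATTRIBUTE_ID = '5024923' THEN ATTRIBUTE_VALUE END AS Page_Name,
           CASE WHEN ATTRIBUTE_ID = '5024925' THEN ATTRIBUTE_VALUE END AS Site_Name,
           CASE WHEN ATTRIBUTE_ID = '5024924' THEN ATTRIBUTE_VALUE END AS Last_Touch_Channel
    FROM Adobe_Analytics_Staging
    GROUP BY ATTRIBUTE_ID, ACTION_DATE, ATTRIBUTE_VALUE
""").fetchall()

# Grouping by visitor and wrapping each CASE in MAX collapses the three
# rows into one, because MAX skips the NULLs.
pivoted = conn.execute("""
    SELECT VISITOR_ID,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024923' THEN ATTRIBUTE_VALUE END) AS Page_Name,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024925' THEN ATTRIBUTE_VALUE END) AS Site_Name,
           MAX(CASE WHEN ATTRIBUTE_ID = '5024924' THEN ATTRIBUTE_VALUE END) AS Last_Touch_Channel
    FROM Adobe_Analytics_Staging
    GROUP BY VISITOR_ID, ACTION_DATE
""").fetchall()

print(len(scattered))  # 3
print(pivoted)         # [('v1', 'Home', 'SiteA', 'Email')]
```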
I have been building up a query today and I have got stuck. I have two unique IDs that identify whether an order is Internal or Web. I have been able to split this out so it counts how many times they appear, but unfortunately it is not giving me the intended result. From research I have tried creating a Count Distinct Case When statement to provide me with the results.
Please see below where I have broken down what it is doing and how I expect it to be.
Original data looks like:
Company Name Order Date Order Items Orders Value REF
-------------------------------------------------------------------------------
CompanyA 03/01/2019 Item1 Order1 170 INT1
CompanyA 03/01/2019 Item2 Order1 0 INT1
CompanyA 03/01/2019 Item3 Order2 160 WEB2
CompanyA 03/01/2019 Item4 Order2 0 WEB2
How I expect it to be:
Company Name Order Date Order Items Orders Value WEB INT
-----------------------------------------------------------------------------------------
CompanyA 03/01/2019 4 2 330 1 1
What currently comes out
Company Name Order Date Order Items Orders Value WEB INT
-----------------------------------------------------------------------------------------
CompanyA 03/01/2019 4 2 330 2 2
As you can see from my current result, it is counting every line even though the reference is the same. It is not a hard and fast rule that it is always doubled up, which is why I think I need a Count Distinct Case When. Below is the query I am currently using. It pulls from a Progress V10 ODBC source that I connect to through Excel. Unfortunately I do not have SSMS and Microsoft Query is just useless.
My Current SQL:
SELECT
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
, Count(DISTINCT SopOrder_0.SooOrderNumber) AS 'Orders'
, SUM(CASE WHEN SopOrder_0.SooOrderNumber IS NOT NULL THEN 1 ELSE 0 END) AS 'Order Items'
, SUM(SopOrderItem_0.SoiValue) AS 'Order Value'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN 1 ELSE 0 END) AS 'INT'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'WEB%' THEN 1 ELSE 0 END) AS 'WEB'
FROM
SBS.PUB.Company Company_0
, SBS.PUB.SopOrder SopOrder_0
, SBS.PUB.SopOrderItem SopOrderItem_0
WHERE
SopOrder_0.SopOrderID = SopOrderItem_0.SopOrderID
AND Company_0.CompanyID = SopOrder_0.CompanyID
AND SopOrder_0.SooOrderDate > '2019-01-01'
GROUP BY
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
I have tried using the following line but it errors on me when importing:
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE 0 END) AS 'INT'
Just so you know, the error I get when importing at the moment is: syntax error at or about "CASE WHEN sopOrder_0.SooParentOrderRefer" (10713)
Try removing the ELSE:
COUNT(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference END) AS num_int
You don't specify the error, but the problem is probably that the THEN is returning a string and the ELSE a number -- so there is an attempt to convert the string values to a number.
Also, learn to use proper, explicit, standard JOIN syntax. Simple rule: Never use commas in the FROM clause.
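The effect of the ELSE branch on the count can be checked with a small sketch. It uses SQLite through Python's sqlite3 module as a stand-in, which is an assumption: SQLite is loosely typed, so it will not reproduce the type-conversion error described above, only the counting behaviour. The table name order_lines and its four rows are made up to mirror the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE order_lines (ref TEXT)")
# Hypothetical data: two lines per order, as in the question's sample.
conn.executemany(
    "INSERT INTO order_lines VALUES (?)",
    [("INT1",), ("INT1",), ("WEB2",), ("WEB2",)],
)

sum_lines, distinct_no_else, distinct_else_zero = conn.execute("""
    SELECT
        -- counts every matching line, so the order is counted twice
        SUM(CASE WHEN ref LIKE 'INT%' THEN 1 ELSE 0 END),
        -- non-matching lines become NULL, which COUNT ignores
        COUNT(DISTINCT CASE WHEN ref LIKE 'INT%' THEN ref END),
        -- non-matching lines become 0, which is counted as one more distinct value
        COUNT(DISTINCT CASE WHEN ref LIKE 'INT%' THEN ref ELSE 0 END)
    FROM order_lines
""").fetchone()

print(sum_lines, distinct_no_else, distinct_else_zero)  # 2 1 2
```

So even on an engine that accepts ELSE 0, the result would be off by one whenever at least one non-matching row exists.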
Count distinct on the SooOrderNumber or the SooParentOrderReference, whichever makes more sense for you.
If you are COUNTing, you need to make NULL the thing that you are not counting. I prefer to include an else in the case because it is more consistent and complete.
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE null END) AS 'INT'
Gordon Linoff is correct regarding the source of your error, i.e. a datatype mismatch between the THEN value and the ELSE value of the CASE expression. Using null removes (or should remove) this ambiguity; I'd need to double-check.
Editing my earlier answer...
Even though it looks, as you say, like count distinct is not supported in Pervasive PSQL, CTEs are supported. So you can do something like...
This is what you are trying to do but it is not supported...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
select id
,count(distinct col1) as col_count
from dups
group by id;
Stick another CTE in the query to de-duplicate the data first. Then count as normal. That should work...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
,de_dup as
(
select id
,col1
from dups
group by id
,col1
)
select id
,count(col1) as col_count
from de_dup
group by id;
These 2 versions should give the same result set.
There is always a way!!
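As a quick check that the two versions agree, the sketch below runs both queries in SQLite through Python's sqlite3 module. This is only a stand-in: SQLite does support COUNT(DISTINCT ...), so both forms run there, which makes the comparison possible. An ORDER BY id is added so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Shared CTE holding the duplicate-laden sample data from the answer.
dups_cte = """
    WITH dups AS (
        SELECT 1 AS id, 'A' AS col1
        UNION ALL SELECT 1, 'A'
        UNION ALL SELECT 1, 'B'
        UNION ALL SELECT 2, 'B'
    )
"""

# Version 1: count distinct directly.
count_distinct = conn.execute(dups_cte + """
    SELECT id, COUNT(DISTINCT col1) AS col_count
    FROM dups
    GROUP BY id
    ORDER BY id
""").fetchall()

# Version 2: de-duplicate in a second CTE, then use a plain COUNT.
count_after_dedup = conn.execute(dups_cte + """
    , de_dup AS (
        SELECT id, col1
        FROM dups
        GROUP BY id, col1
    )
    SELECT id, COUNT(col1) AS col_count
    FROM de_dup
    GROUP BY id
    ORDER BY id
""").fetchall()

print(count_distinct)     # [(1, 2), (2, 1)]
print(count_after_dedup)  # [(1, 2), (2, 1)]
```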
I cannot explain the error you are getting. You are mistakenly using single quotes for alias names, but I don't actually think this is causing the error.
Anyway, I suggest you aggregate your order items per order first and only join then:
SELECT
c.coacompanyname
, so.sooorderdate
, COUNT(*) AS orders
, SUM(soi.itemcount) AS order_items
, SUM(soi.ordervalue) AS order_value
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'INT%' THEN 1 END) AS int
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'WEB%' THEN 1 END) AS web
FROM sbs.pub.company c
JOIN sbs.pub.soporder so ON so.companyid = c.companyid
JOIN
(
SELECT soporderid, COUNT(*) AS itemcount, SUM(soivalue) AS ordervalue
FROM sbs.pub.soporderitem
GROUP BY soporderid
) soi ON soi.soporderid = so.soporderid
GROUP BY c.coacompanyname, so.sooorderdate
ORDER BY c.coacompanyname, so.sooorderdate;
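Here is a small sketch of that aggregate-first approach against the sample data from the question, run in SQLite through Python's sqlite3 module. The three tables are cut down to the columns the query touches, the ID values are invented, and the aliases int_orders and web_orders are my own choice to avoid using a type keyword as a column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company      (companyid INTEGER, coacompanyname TEXT);
    CREATE TABLE soporder     (soporderid INTEGER, companyid INTEGER,
                               sooorderdate TEXT, sooparentorderreference TEXT);
    CREATE TABLE soporderitem (soporderid INTEGER, soivalue INTEGER);

    INSERT INTO company  VALUES (1, 'CompanyA');
    INSERT INTO soporder VALUES (1, 1, '2019-01-03', 'INT1');
    INSERT INTO soporder VALUES (2, 1, '2019-01-03', 'WEB2');
    INSERT INTO soporderitem VALUES (1, 170);
    INSERT INTO soporderitem VALUES (1, 0);
    INSERT INTO soporderitem VALUES (2, 160);
    INSERT INTO soporderitem VALUES (2, 0);
""")

# The subquery reduces the items to one row per order before the join,
# so each order is seen once and plain COUNT(CASE ...) is enough.
summary = conn.execute("""
    SELECT c.coacompanyname,
           so.sooorderdate,
           COUNT(*)            AS orders,
           SUM(soi.itemcount)  AS order_items,
           SUM(soi.ordervalue) AS order_value,
           COUNT(CASE WHEN so.sooparentorderreference LIKE 'INT%' THEN 1 END) AS int_orders,
           COUNT(CASE WHEN so.sooparentorderreference LIKE 'WEB%' THEN 1 END) AS web_orders
    FROM company c
    JOIN soporder so ON so.companyid = c.companyid
    JOIN (
        SELECT soporderid, COUNT(*) AS itemcount, SUM(soivalue) AS ordervalue
        FROM soporderitem
        GROUP BY soporderid
    ) soi ON soi.soporderid = so.soporderid
    GROUP BY c.coacompanyname, so.sooorderdate
""").fetchall()

print(summary)  # [('CompanyA', '2019-01-03', 2, 4, 330, 1, 1)]
```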
Recently I have very specific problem with data we get from our data-warehouse. The problem is being solved, but I have to edit our control environment for a while.
We have data about received invoices, however due to some reason, information about every invoice is split into two rows: First row has important columns unique_code_A, vendor_number, and the second row has important columns unique_code_B, amount. So every invoice has very specific unique code, and with this code I have to somehow join the information from both rows, as you can see in picture.
Well, you can use aggregation:
select date_key, invoice_type,
max(case when unique_code_b is null then unique_code_a end) as unique_code_a,
max(unique_code_b) as unique_code_b,
max(case when unique_code_b is null then vendor_number end) as vendor_number,
max(case when unique_code_b is not null then amount end) as amount
from t
group by date_key, invoice_type;
EDIT:
If the unique codes can be used for matching, then I would suggest:
select date_key, invoice_type,
coalesce(unique_code_a, unique_code_b) as unique_code,
max(case when unique_code_b is null then vendor_number end) as vendor_number,
max(case when unique_code_b is not null then amount end) as amount
from t
group by date_key, invoice_type, coalesce(unique_code_a, unique_code_b);
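A minimal sketch of the second query, run in SQLite through Python's sqlite3 module. The data is invented and rests on two assumptions the question does not confirm: the first row of an invoice has unique_code_b NULL, and its unique_code_a equals the unique_code_b of the second row. Codes 'X1'/'X2', vendors 'V9'/'V7' and the invoice type 'STD' are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        date_key TEXT, invoice_type TEXT,
        unique_code_a TEXT, unique_code_b TEXT,
        vendor_number TEXT, amount INTEGER
    )
""")
# Hypothetical data: each invoice is split across two rows.
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("2020-01-01", "STD", "X1", None, "V9", None),  # row with vendor
        ("2020-01-01", "STD", None, "X1", None, 100),   # row with amount
        ("2020-01-01", "STD", "X2", None, "V7", None),
        ("2020-01-01", "STD", None, "X2", None, 250),
    ],
)

# COALESCE picks whichever code column is filled, giving both halves of an
# invoice the same grouping key; MAX then pulls the non-NULL value from each.
invoices = conn.execute("""
    SELECT date_key, invoice_type,
           COALESCE(unique_code_a, unique_code_b) AS unique_code,
           MAX(CASE WHEN unique_code_b IS NULL     THEN vendor_number END) AS vendor_number,
           MAX(CASE WHEN unique_code_b IS NOT NULL THEN amount        END) AS amount
    FROM t
    GROUP BY date_key, invoice_type, COALESCE(unique_code_a, unique_code_b)
    ORDER BY unique_code
""").fetchall()

print(invoices)
# [('2020-01-01', 'STD', 'X1', 'V9', 100), ('2020-01-01', 'STD', 'X2', 'V7', 250)]
```

Grouping on the coalesced code is what keeps two invoices with the same date and type apart; the first query, which groups only on date_key and invoice_type, would merge them.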
From what you've described, a self join should probably work:
SELECT
A.DATE_KEY,
A.INVOICE_TYPE,
A.UNIQUE_CODE_A,
B.UNIQUE_CODE_B,
A.VENDOR_NUMBER,
B.AMOUNT
FROM MyTable A
INNER JOIN MyTable B ON A.UNIQUE_CODE_A=B.UNIQUE_CODE_B
I have a sql code which I am using to do pivot. Code is as follows:
SELECT DISTINCT PersonID
,MAX(pivotColumn1)
,MAX(pivotColumn2) --originally these were in 2 separate rows
FROM (SELECT srcID, PersonID, detailCode, detailValue FROM src) AS SrcTbl
PIVOT(MAX(detailValue) FOR detailCode IN ([pivotColumn1],[pivotColumn2])) pvt
GROUP BY PersonID
In the source data the ID has 2 separate rows because each row has its own ID, which separates the values. I have now pivoted it and it's still giving me 2 separate rows for the ID, even though I grouped it and used aggregation on the pivot columns. Any idea what's wrong with the code?
So I have all my possible detailCode listed in the IN clause. So I have null returned when the value is none but I want it all summarised in 1 row. See image below.
If those are all the options of detailCode , you can use conditional aggregation with CASE EXPRESSION instead of Pivot:
SELECT t.personID,
MAX(CASE WHEN t.detailCode = 'cas' then t.detailValue END) as cas,
MAX(CASE WHEN t.detailCode = 'buy' then t.detailValue END) as buy,
MAX(CASE WHEN t.detailCode = 'sel' then t.detailValue END) as sel,
MAX(CASE WHEN t.detailCode = 'pla' then t.detailValue END) as pla
FROM YourTable t
GROUP BY t.personID
So I'm trying to construct a query in BigQuery that I'm struggling with for a final part.
As of now I have:
SELECT
UNIQUE(Name) as SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) as RevenueGenerated
FROM (
SELECT
mantaSubscriptionIdmetadata,
planIdmetadata,
INTEGER(Amount) as RevenueGenerated
FROM
[sample_internal_data.charge0209]
WHERE
revenueSourcemetadata = 'new'
AND
Status = 'Paid'
GROUP BY
mantaSubscriptionIdmetadata,
planIdmetadata,
RevenueGenerated
)a
JOIN (
SELECT
id,
Name,
Interval
FROM
[sample_internal_data.subplans]
WHERE
id in ('150017','150030','150033','150019')
GROUP BY
id,
Name,
Interval )b
ON
a.planIdmetadata = b.id
GROUP BY
ID,
Interval,
Name
ORDER BY
Interval ASC
The resulting query looks like this
Which is exactly what I'm looking for up to that point.
Now here is what I'm stuck on. There is another column I need to add called SalesRepName. The resulting field will either be null or not null. If it's null, it means it was sold online. If it's not null, it means it was sold via telephone. What I want to do is create two additional columns that say how many were sold via telesales and how many online. The sum of the two columns will always equal the SubsPurchased total.
Can anyone help?
You can include case statements within aggregate functions. Here you could choose sum(case when SalesRepName is null then 1 else 0 end) as online and sum(case when SalesRepName is not null then 1 else 0 end) as telesales.
count(case when SalesRepName is null then 1 end) as online would give the same result. Using sum in these situations is simply my personal preference.
Note that omitting the else clause is equivalent to setting else null, and null isn't counted by count. This can be very useful in combination with exact_count_distinct, which has no equivalent in terms of sum.
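The equivalence of the sum and count forms can be checked with a short sketch. It uses SQLite through Python's sqlite3 module rather than BigQuery, which is an assumption; the CASE semantics shown here are standard SQL. The sales table and the rep name 'Ann' are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (SalesRepName TEXT)")
# Hypothetical data: two online sales (NULL rep) and one telesale.
conn.executemany(
    "INSERT INTO sales VALUES (?)",
    [(None,), (None,), ("Ann",)],
)

online_sum, online_count, telesales, total = conn.execute("""
    SELECT
        -- sum form: every row contributes 1 or 0
        SUM(CASE WHEN SalesRepName IS NULL THEN 1 ELSE 0 END),
        -- count form: no ELSE means NULL, and COUNT skips NULLs
        COUNT(CASE WHEN SalesRepName IS NULL THEN 1 END),
        SUM(CASE WHEN SalesRepName IS NOT NULL THEN 1 ELSE 0 END),
        COUNT(*)
    FROM sales
""").fetchone()

print(online_sum, online_count, telesales, total)  # 2 2 1 3
```

The two new columns add up to the row count, which matches the requirement that online plus telesales equals SubsPurchased.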
Try below:
it assumes your SalesRepName field is in [sample_internal_data.charge0209] table
and then it uses "tiny version" of SUM(CASE ... WHEN ...) which works when you need 0 or 1 as a result to be SUM'ed
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
SELECT
UNIQUE(Name) AS SubscriptionName,
ID,
Interval,
COUNT(mantaSubscriptionIdmetadata) AS SubsPurchased,
SUM(RevenueGenerated) AS RevenueGenerated,
SUM(SalesRepName IS NULL) AS onlinesales,
SUM(NOT SalesRepName IS NULL) AS telesales
FROM (
SELECT SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, INTEGER(Amount) AS RevenueGenerated
FROM [sample_internal_data.charge0209]
WHERE revenueSourcemetadata = 'new'
AND Status = 'Paid'
GROUP BY SalesRepName, mantaSubscriptionIdmetadata, planIdmetadata, RevenueGenerated
)a
JOIN (
SELECT id, Name, Interval
FROM [sample_internal_data.subplans]
WHERE id IN ('150017','150030','150033','150019')
GROUP BY id, Name, Interval
)b
ON a.planIdmetadata = b.id
GROUP BY ID, Interval, Name
ORDER BY Interval ASC