I have some sales data that shows whether a bill has been generated for a customer. The column bill_generated contains 'Y' if a bill has been generated; otherwise it is blank (NULL). I am trying to find the list of customers for whom at least one bill has been generated. There can be multiple rows for each cust_id, as shown below:
cust_id, bill_generated
001,NULL
001,Y
002,NULL
002,NULL
003,Y
Could anyone advise on this? I am using Redshift. Thanks.
Try the following, using GROUP BY and a HAVING clause. Note the >= 1: comparing with = 1 would drop customers who have more than one bill.
select cust_id from tablename
group by cust_id
having sum(case when bill_generated = 'Y' then 1 else 0 end) >= 1
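You can use a correlated subquery. Selecting distinct cust_id returns each customer once instead of every row for that customer:
select distinct t.cust_id from t
where exists (select 1 from t t1
where t1.bill_generated='Y' and t1.cust_id=t.cust_id
)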
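As a quick check of the "at least one" condition, here is a minimal sketch that runs the GROUP BY / HAVING idea on the sample rows. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for Redshift, and the table name sales and the second 'Y' row for customer 003 are assumptions added for the demo.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (cust_id TEXT, bill_generated TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("001", None), ("001", "Y"),
        ("002", None), ("002", None),
        ("003", "Y"), ("003", "Y"),  # second bill for 003 is a hypothetical extra row
    ],
)

def customers(comparison):
    """Return cust_ids whose count of 'Y' rows satisfies the given comparison."""
    sql = f"""
        SELECT cust_id
        FROM sales
        GROUP BY cust_id
        HAVING SUM(CASE WHEN bill_generated = 'Y' THEN 1 ELSE 0 END) {comparison}
        ORDER BY cust_id
    """
    return [row[0] for row in conn.execute(sql)]

exactly_one = customers("= 1")    # misses customers with two or more bills
at_least_one = customers(">= 1")  # what the question asks for

print(exactly_one)   # ['001']
print(at_least_one)  # ['001', '003']
```

Customer 003 only shows up with >= 1, which is why the comparison operator matters once a customer can have several bills.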
I am using SQL Server and wondering whether it is possible to iterate through time series data until a specific condition is met and, based on that, label my data in another table.
For example, let's say I have a table like this:
Id Date Some_kind_of_event
+--+----------+------------------
1 |2018-01-01|dsdf...
1 |2018-01-06|sdfs...
1 |2018-01-29|fsdfs...
2 |2018-05-10|sdfs...
2 |2018-05-11|fgdf...
2 |2018-05-12|asda...
3 |2018-02-15|sgsd...
3 |2018-02-16|rgw...
3 |2018-02-17|sgs...
3 |2018-02-28|sgs...
What I want is to calculate, for each key, the difference between each pair of adjacent events and find out whether any of those differences is greater than 10 days. If so, I want to stop iterating for that key and put the label 'inactive' in my other table; otherwise the label is 'active'. After we finish with one key, we start with the next.
So, for example, id = 1 would get the label 'inactive' because there are two adjacent dates whose difference is bigger than 10 days. The final result would look like this:
Id Label
+--+----------+
1 |inactive
2 |active
3 |inactive
Any ideas how to do that? Is it possible to do it with SQL?
When working with a DBMS you need to get away from the idea of thinking iteratively. Instead you need to try and think in sets. "Instead of thinking about what you want to do to a row, think about what you want to do to a column."
If I understand correctly, is this what you're after?
CREATE TABLE SomeEvent (ID int, EventDate date, EventName varchar(10));
INSERT INTO SomeEvent
VALUES (1,'20180101','dsdf...'),
(1,'20180106','sdfs...'),
(1,'20180129','fsdfs..'),
(2,'20180510','sdfs...'),
(2,'20180511','fgdf...'),
(2,'20180512','asda...'),
(3,'20180215','sgsd...'),
(3,'20180216','rgw....'),
(3,'20180217','sgs....'),
(3,'20180228','sgs....');
GO
WITH Gaps AS(
SELECT *,
DATEDIFF(DAY,LAG(EventDate) OVER (PARTITION BY ID ORDER BY EventDate),EventDate) AS EventGap
FROM SomeEvent)
SELECT ID,
CASE WHEN MAX(EventGap) > 10 THEN 'inactive' ELSE 'active' END AS Label
FROM Gaps
GROUP BY ID
ORDER BY ID;
GO
DROP TABLE SomeEvent;
GO
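This assumes you are using SQL Server 2012 or later, since it uses the LAG function; SQL Server 2008 has less than 12 months of any kind of support left. The first event of each ID has no previous event, so its EventGap is NULL, and MAX ignores NULLs.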
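To see the set-based version work end to end without a SQL Server instance, here is a minimal sketch of the same LAG query on the sample data. It runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), uses julianday() as a substitute for DATEDIFF, and adds a hypothetical ID 4 with a single event to show what happens when there is no gap at all.

```python
import sqlite3

# SQLite stands in for SQL Server; julianday() replaces DATEDIFF (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SomeEvent (ID INTEGER, EventDate TEXT)")
conn.executemany(
    "INSERT INTO SomeEvent VALUES (?, ?)",
    [
        (1, "2018-01-01"), (1, "2018-01-06"), (1, "2018-01-29"),
        (2, "2018-05-10"), (2, "2018-05-11"), (2, "2018-05-12"),
        (3, "2018-02-15"), (3, "2018-02-16"), (3, "2018-02-17"), (3, "2018-02-28"),
        (4, "2018-03-01"),  # hypothetical ID with only one event
    ],
)

labels = conn.execute("""
    WITH Gaps AS (
        SELECT ID,
               -- days since the previous event of the same ID (NULL for the first one)
               julianday(EventDate)
                 - julianday(LAG(EventDate) OVER (PARTITION BY ID ORDER BY EventDate))
                 AS EventGap
        FROM SomeEvent
    )
    SELECT ID,
           CASE WHEN MAX(EventGap) > 10 THEN 'inactive' ELSE 'active' END AS Label
    FROM Gaps
    GROUP BY ID
    ORDER BY ID
""").fetchall()

print(labels)
# [(1, 'inactive'), (2, 'active'), (3, 'inactive'), (4, 'active')]
```

ID 4 has only a NULL gap, so MAX returns NULL, the comparison is not true, and the CASE falls through to 'active'. Whether a single-event key should count as active is a decision for your data, not something the query settles.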
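Try this. Note: replace #MyTable with your actual table. The LEAD must be partitioned by Id so that the last event of one Id is not compared with the first event of the next.
WITH Diffs AS (
SELECT
Id
,DATEDIFF(DAY,[Date],LEAD([Date]) OVER (PARTITION BY [Id] ORDER BY [Date])) Diff
FROM #MyTable)
SELECT
Id
,CASE WHEN MAX(Diff) > 10 THEN 'Inactive' ELSE 'Active' END AS Label
FROM Diffs
GROUP BY Id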
Just to share another approach (without a CTE).
SELECT
ID
, CASE WHEN SUM(TotalDays) = (MAX(CNT) - 1) THEN 'Active' ELSE 'Inactive' END Label
FROM (
SELECT
ID
, EventDate
, CASE WHEN DATEDIFF(DAY, EventDate, LEAD(EventDate) OVER(PARTITION BY ID ORDER BY EventDate)) <= 10 THEN 1 ELSE 0 END TotalDays
, COUNT(ID) OVER(PARTITION BY ID) CNT
FROM EventsTable
) D
GROUP BY ID
The method counts how many records each ID has and flags each gap: for every row it takes the difference in days between the current date and the next date, giving 1 if the difference is at most 10 days and 0 otherwise. The last row of each ID has no next date, so it always gets 0.
Then it compares: if the sum of the flags equals the number of records for that ID minus one, the ID is Active; otherwise it is Inactive.
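The boundary case is worth checking with this counting approach, because the question treats only gaps greater than 10 days as inactive. The sketch below is an illustration under assumptions: it runs on SQLite via Python's sqlite3 module (3.25 or later for LEAD), uses julianday() in place of DATEDIFF, and the two IDs are made-up rows with gaps of exactly 10 and 11 days.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EventsTable (ID INTEGER, EventDate TEXT)")
conn.executemany(
    "INSERT INTO EventsTable VALUES (?, ?)",
    [
        (1, "2018-01-01"), (1, "2018-01-11"),  # gap of exactly 10 days
        (2, "2018-01-01"), (2, "2018-01-12"),  # gap of 11 days
    ],
)

def label_ids(op):
    """Label each ID using the flag-and-count method with the given comparison."""
    sql = f"""
        SELECT ID,
               CASE WHEN SUM(Flag) = MAX(CNT) - 1 THEN 'Active' ELSE 'Inactive' END
        FROM (
            SELECT ID,
                   -- 1 when the gap to the next event is small enough, else 0
                   CASE WHEN julianday(LEAD(EventDate) OVER (PARTITION BY ID ORDER BY EventDate))
                             - julianday(EventDate) {op} 10
                        THEN 1 ELSE 0 END AS Flag,
                   COUNT(ID) OVER (PARTITION BY ID) AS CNT
            FROM EventsTable
        ) D
        GROUP BY ID
        ORDER BY ID
    """
    return dict(conn.execute(sql).fetchall())

with_le = label_ids("<=")  # a 10-day gap still counts as active
with_lt = label_ids("<")   # a 10-day gap is treated as inactive

print(with_le)  # {1: 'Active', 2: 'Inactive'}
print(with_lt)  # {1: 'Inactive', 2: 'Inactive'}
```

Only the <= comparison agrees with the "difference > 10 days means inactive" rule from the question when a gap is exactly 10 days.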
I don't exactly know how to title this question, but I am looking to create a stored procedure (or procedures) that builds a new table of averages. I have collected survey data from 19 sites. I want to count each column twice, with two different conditions.
E.g.
SELECT COUNT(ColumnName)
FROM TableName
WHERE ColumnName = 3
SELECT COUNT(ColumnName)
FROM TableName
WHERE ColumnName = 4
From there I would like to add those two numbers together then divide by another count for another column in the table.
Basically I want to know how many surveys have the answer 3 and 4 then divide them by how many surveys were answered. Also keep in mind I want numbers based on each site.
Use group by:
select columnname, count(*) as cnt
from tablename
where columnname in (3, 4)
group by columnname;
You seem to want the following (using 1.0 avoids integer division):
select sum(case when col in (3,4) then 1.0 else 0 end) / count(*)
from tablename t
So I have gotten a bit closer to what I am trying to achieve, but it is still not doing what I want it to do. This is what I have come up with, but I don't know how to divide by the sum.
SELECT (SELECT COUNT(*) FROM Resident_Survey WHERE CanbealonewhenIwish = 3 and Village = 'WP' and Setting = 'LTC') +
(SELECT COUNT(*) FROM Resident_Survey WHERE CanbealonewhenIwish = 4 and Village = 'WP' and Setting = 'LTC') /
(SELECT COUNT(*) FROM Resident_Survey WHERE Village = 'WP' and Setting = 'LTC') AS ICanbealonewhenIwish
I figured it out. I was looking to create this query.
SELECT 100.0 *
COUNT(CASE WHEN Privacyisrespected IN (3,4)
THEN 1
ELSE NULL END) /
COUNT(*) AS Myprivacyisrespected
FROM Resident_Survey
WHERE Village = 'WP'
and Setting = 'LTC'
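Since the question asked for numbers per site, the same conditional count can be grouped instead of filtered. The sketch below is one way to do that, not the asker's final procedure: it runs on SQLite through Python's sqlite3 module, the rows and the second village 'XX' are invented for the demo, and it also shows why the 100.0 factor matters when both counts are integers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Resident_Survey (Village TEXT, Setting TEXT, Privacyisrespected INTEGER)"
)
conn.executemany(
    "INSERT INTO Resident_Survey VALUES (?, ?, ?)",
    [
        ("WP", "LTC", 3), ("WP", "LTC", 4), ("WP", "LTC", 2), ("WP", "LTC", 1),
        # 'XX' is a hypothetical second site
        ("XX", "LTC", 4), ("XX", "LTC", 4), ("XX", "LTC", 4), ("XX", "LTC", 1),
    ],
)

rows = conn.execute("""
    SELECT Village,
           Setting,
           -- integer / integer truncates, so this ratio collapses to 0
           COUNT(CASE WHEN Privacyisrespected IN (3, 4) THEN 1 END) / COUNT(*) AS int_ratio,
           -- multiplying by 100.0 first forces decimal arithmetic
           100.0 * COUNT(CASE WHEN Privacyisrespected IN (3, 4) THEN 1 END) / COUNT(*) AS pct
    FROM Resident_Survey
    GROUP BY Village, Setting
    ORDER BY Village, Setting
""").fetchall()

print(rows)
# [('WP', 'LTC', 0, 50.0), ('XX', 'LTC', 0, 75.0)]
```

COUNT(*) counts every row for the site. If unanswered questions are stored as NULL, COUNT(Privacyisrespected) in the denominator would count only the answered ones.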
I have a situation where one table contains record 'a' with order number 0 and also record 'a' with order number 1. This is correct.
I also have record 'b', which has order number 1, but there is no row for record 'b' with order number 0. This is not correct.
I need to create a script that finds all records where order number 1 exists but order number 0 doesn't. Can you help with this?
I cannot simply use:
SELECT DISTINCT record FROM tablename WHERE order_number <> 0
because it would also return record 'a', which I don't want in the results.
I was thinking about using NOT EXISTS, but the examples I have seen always compare two tables, whereas I have all records in one table.
Regards
Using NOT IN in the WHERE clause will eliminate 'a' and return only 'b'.
Try this:
SELECT DISTINCT record FROM tablename WHERE order_number <> 0
and record not in (Select record from tablename WHERE order_number = 0);
Hope this helps.
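NOT EXISTS also works against a single table: the table is simply referenced twice under different aliases. The sketch below shows that, and compares it with NOT IN when the subquery can return a NULL. It runs on SQLite via Python's sqlite3 module, and the row with a NULL record is a hypothetical addition to the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (record TEXT, order_number INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        ("a", 0), ("a", 1),
        ("b", 1),
        (None, 0),  # hypothetical row whose record is NULL
    ],
)

# Same table under two aliases: t is the row being checked, z is the lookup.
not_exists = [row[0] for row in conn.execute("""
    SELECT DISTINCT t.record
    FROM tablename t
    WHERE t.order_number = 1
      AND NOT EXISTS (SELECT 1
                      FROM tablename z
                      WHERE z.record = t.record
                        AND z.order_number = 0)
""")]

# NOT IN compares against every value in the list, including the NULL,
# so the predicate is never true once a NULL is present.
not_in = [row[0] for row in conn.execute("""
    SELECT DISTINCT record
    FROM tablename
    WHERE order_number = 1
      AND record NOT IN (SELECT record FROM tablename WHERE order_number = 0)
""")]

print(not_exists)  # ['b']
print(not_in)      # []
```

If record can never be NULL, both forms return 'b'; NOT EXISTS is the one that keeps working if a NULL ever appears.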
I am slowly finding out that REPLACE does not work with wildcards. I have approximately 10 SKUs, and each of these has approximately 20 sub-SKUs.
Example:
_Example - Parent SKU
Example7bTL - Child SKU
The end result would be to turn all child SKUs into _Example so I can get a sum of units sold in a clean format.
This is what I currently have, for reference:
use test
CREATE TABLE #test (
Test int,
BillQty char(300) )
select Quantity, REPLACE (SKU, '%EXAMP%', 'Example') As Sku
from test.dbo.tblSFCOrderTxn
drop table #Test
Raw data example:
Quantity SKU
210 EXAMPLE7BOTL-C
42 EXAMPLE4BOTL
30 EXAMPLE1BOTL
28 EXAMPLE12BOTL
100 EXAMPLE7BOTL
97 EXAMPLE4BOTL
29 EXAMPLE7BOTL-C
What I want it to be
Quantity SKU
536 _Example
I am using SQL Server 2012
Given your data, just use case:
select (case when left(sku, 1) = '_' then '_Example' else sku end)
You shouldn't change the data just to do a SUM(). I would go with:
SELECT
CASE WHEN SKU LIKE '%EXAMP%' THEN '_Example' ELSE SKU END AS SKU,
SUM(Quantity) AS Quantity
FROM test.dbo.tblSFCOrderTxn
GROUP BY CASE WHEN SKU LIKE '%EXAMP%' THEN '_Example' ELSE SKU END
ORDER BY 1
If you really do want to change the data, it would be:
UPDATE test.dbo.tblSFCOrderTxn
SET SKU = '_Example'
WHERE SKU LIKE '%EXAMP%'
This may be helpful:
SELECT SUM(Quantity) AS Quantity, '_Example' AS SKU FROM test.dbo.tblSFCOrderTxn WHERE SKU LIKE '%EXAMP%'
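To make the difference between REPLACE and LIKE concrete, here is a minimal sketch on the raw data from the question. It runs on SQLite through Python's sqlite3 module rather than SQL Server 2012, the row 'OTHER1' is a hypothetical unrelated SKU, and note that SQLite's LIKE is case-insensitive for ASCII by default while SQL Server's behaviour depends on the collation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblSFCOrderTxn (Quantity INTEGER, SKU TEXT)")
conn.executemany(
    "INSERT INTO tblSFCOrderTxn VALUES (?, ?)",
    [
        (210, "EXAMPLE7BOTL-C"), (42, "EXAMPLE4BOTL"), (30, "EXAMPLE1BOTL"),
        (28, "EXAMPLE12BOTL"), (100, "EXAMPLE7BOTL"), (97, "EXAMPLE4BOTL"),
        (29, "EXAMPLE7BOTL-C"),
        (5, "OTHER1"),  # hypothetical SKU outside the EXAMPLE family
    ],
)

# REPLACE looks for the literal text '%EXAMP%', which never occurs,
# so the value comes back unchanged.
unchanged = conn.execute(
    "SELECT REPLACE('EXAMPLE7BOTL', '%EXAMP%', 'Example')"
).fetchone()[0]

# LIKE does pattern matching, so it can map every child SKU to the parent.
totals = conn.execute("""
    SELECT CASE WHEN SKU LIKE '%EXAMP%' THEN '_Example' ELSE SKU END AS SKU,
           SUM(Quantity) AS Quantity
    FROM tblSFCOrderTxn
    GROUP BY CASE WHEN SKU LIKE '%EXAMP%' THEN '_Example' ELSE SKU END
    ORDER BY 1
""").fetchall()

print(unchanged)  # EXAMPLE7BOTL
print(totals)     # [('OTHER1', 5), ('_Example', 536)]
```

The 536 matches the total the question asks for, and the unrelated SKU keeps its own row instead of being folded into _Example.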