Here is a sample table:
| customer_token | created_date | orders | views |
+--------------------------------------+------------------------------+--------+-------+
| 93a03e36-83a0-494b-bd68-495f54f406ca | 10-NOV-14 14.41.09.000000000 | 1 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 20-NOV-14 14.41.47.000000000 | 0 | 1 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 26-OCT-14 16.14.30.000000000 | 2 | 0 |
| 93a03e36-83a0-494b-bd68-495f54f406ca | 11-OCT-14 16.31.11.000000000 | 0 | 2 |
In this customer data table I store all of the dates when a given customer has placed an order or viewed a product. Now, for a report, I want to write a query that, for each customer (customer_token), generates the last_order_date (latest row where orders > 0) and last_view_date (latest row where views > 0).
I am looking for an efficient query as I have millions of records.
select customer_token,
max(case when orders > 0 then created_date else NULL end),
max(case when views > 0 then created_date else NULL end)
from Customer
group by customer_token;
Update: This query is quite efficient because Oracle is likely to scan the table only once. Also there is an interesting thing with grouping - when you use GROUP BY a select list can only contain columns which are in the GROUP BY or aggregate functions. In this query MAX is calculated for the column created_date, but you don't need to put orders and views in a GROUP BY because they are in the expression inside MAX function. It's not very common.
When you want the largest value in a column, use the MAX() aggregate function. Because an aggregate function collapses many rows into a single value, every other column in the select list has to appear in a GROUP BY clause.
In this case, you want to group by customer_token. That way, you'll receive one row per group, and the aggregate function will give you the value for that group.
However, you only want to see the dates where the cell value is greater than 0, so I recommend you put a case statement inside your MAX() function like this:
SELECT customer_token,
MAX(CASE WHEN orders > 0 THEN created_date ELSE NULL END) AS latestOrderDate,
MAX(CASE WHEN views > 0 THEN created_date ELSE NULL END) AS latestViewDate
FROM customer
GROUP BY customer_token;
This gives you the latest date among the rows where orders is positive and, separately, the latest date among the rows where views is positive. Without the CASE expression, MAX(created_date) would simply return the customer's latest date of any kind, so both columns would show the same value.
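To see the conditional MAX at work on the sample rows from the question, here is a minimal sketch. It assumes SQLite (through Python's built-in sqlite3 module) as a stand-in for Oracle, and it stores the timestamps as ISO-8601 text, which is my assumption and not the original column type.

```python
import sqlite3

# SQLite stands in for Oracle here; timestamps are stored as ISO-8601 text
# so that MAX() on the strings matches chronological order.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_token TEXT, created_date TEXT, "
    "orders INTEGER, views INTEGER)"
)
token = "93a03e36-83a0-494b-bd68-495f54f406ca"
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        (token, "2014-11-10 14:41:09", 1, 0),
        (token, "2014-11-20 14:41:47", 0, 1),
        (token, "2014-10-26 16:14:30", 2, 0),
        (token, "2014-10-11 16:31:11", 0, 2),
    ],
)

# CASE yields NULL for rows that do not qualify, and MAX() ignores NULLs.
last_dates = conn.execute(
    """
    SELECT customer_token,
           MAX(CASE WHEN orders > 0 THEN created_date END) AS last_order_date,
           MAX(CASE WHEN views  > 0 THEN created_date END) AS last_view_date
    FROM customer
    GROUP BY customer_token
    """
).fetchall()
print(last_dates)
```

The customer comes back as one row, with 10-NOV-14 as the last order date and 20-NOV-14 as the last view date.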
Here is an oracle reference for aggregate functions.
I have a database with a lot of columns with pass, fail, blank indicators
I want to create a function to count each type of value and create a table from the counts. The structure I am thinking is something like
| Value | x                | y                | z                |
|-------|------------------|------------------|------------------|
| pass  | count if x=pass  | count if y=pass  | count if z=pass  |
| fail  | count if x=fail  | count if y=fail  | count if z=fail  |
| blank | count if x=blank | count if y=blank | count if z=blank |
| total | count(x)         | count(y)         | count(z)         |
where x,y,z are columns from another table.
I don't know which could be the best approach for this
thank you all in advance
I tried this structure but it shows syntax error
CREATE FUNCTION Countif (columnx nvarchar(20),value_compare nvarchar(10))
RETURNS Count_column_x AS
BEGIN
IF columnx=value_compare
count(columnx)
END
RETURN
END
Also, I don't know how to add each count to the actual table I am trying to create
Conditional counting (or any conditional aggregation) can often be done inline by placing a CASE expression inside the aggregate function that conditionally returns the value to be aggregated or a NULL to skip.
An example would be COUNT(CASE WHEN SelectMe = 1 THEN 1 END). Here the aggregated value is 1 (which could be any non-null value for COUNT(); for other aggregate functions, a more meaningful value would be provided). The implicit ELSE returns a NULL, which is not counted.
For your problem, I believe the first thing to do is to UNPIVOT your data, placing the column names and values side by side. You can then group by value and use conditional aggregation as described above to calculate your results. A few more details finish the job: (1) a totals row using WITH ROLLUP, (2) a CASE expression to adjust the labels for the blank and total rows, and (3) some ORDER BY tricks to get the rows in the right order.
The query may be something like:
SELECT
CASE
WHEN GROUPING(U.Value) = 1 THEN 'Total'
WHEN U.Value = '' THEN 'Blank'
ELSE U.Value
END AS Value,
COUNT(CASE WHEN U.Col = 'x' THEN 1 END) AS x,
COUNT(CASE WHEN U.Col = 'y' THEN 1 END) AS y
FROM #Data D
UNPIVOT (
Value
FOR Col IN (x, y)
) AS U
GROUP BY U.Value WITH ROLLUP
ORDER BY
GROUPING(U.Value),
CASE U.Value WHEN 'Pass' THEN 1 WHEN 'Fail' THEN 2 WHEN '' THEN 3 ELSE 4 END,
U.VALUE
Sample data:
| x    | y    |
|------|------|
| Pass | Pass |
| Pass | Fail |
| Pass |      |
| Fail |      |
Sample results:
| Value | x | y |
|-------|---|---|
| Pass  | 3 | 1 |
| Fail  | 1 | 1 |
| Blank | 0 | 2 |
| Total | 4 | 4 |
See this db<>fiddle for a working example.
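For anyone who wants to try the idea without SQL Server, here is a rough sketch in SQLite via Python's sqlite3 module. SQLite has neither UNPIVOT nor WITH ROLLUP, so this version swaps in a UNION ALL for the unpivot step and a second UNION ALL for the totals row. The table name data and the four sample rows are assumptions, chosen so that the counts match the sample results above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (x TEXT, y TEXT)")
# Assumed sample rows; blanks are stored as empty strings, not NULL.
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("Pass", "Pass"), ("Pass", "Fail"), ("Pass", ""), ("Fail", "")],
)

counts = conn.execute(
    """
    WITH u AS (                      -- manual unpivot: one row per cell
        SELECT 'x' AS col, x AS value FROM data
        UNION ALL
        SELECT 'y', y FROM data
    )
    SELECT label, x, y
    FROM (
        SELECT CASE WHEN value = '' THEN 'Blank' ELSE value END AS label,
               COUNT(CASE WHEN col = 'x' THEN 1 END) AS x,
               COUNT(CASE WHEN col = 'y' THEN 1 END) AS y
        FROM u
        GROUP BY value
        UNION ALL                    -- total row, in place of WITH ROLLUP
        SELECT 'Total',
               COUNT(CASE WHEN col = 'x' THEN 1 END),
               COUNT(CASE WHEN col = 'y' THEN 1 END)
        FROM u
    )
    ORDER BY CASE label WHEN 'Pass' THEN 1 WHEN 'Fail' THEN 2
                        WHEN 'Blank' THEN 3 ELSE 4 END
    """
).fetchall()
print(counts)
```

Adding a third column z would mean one more branch in the UNION ALL and one more COUNT(CASE ...) per select list.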
I don't think you need a generic solution such as a function that takes the value as a parameter.
Perhaps you could create a view that groups your data, and then query that view, filtering by the value you want.
The view body would be something like this:
select value, count(*) as Total
from table_name
group by value
Feel free to explain your situation better so I could help you.
You can do this by grouping by the status column.
select status, count(*) as total
from some_table
group by status
Rather than making a whole new table, consider using a view. This is a query that looks like a table.
create view status_counts as
select status, count(*) as total
from some_table
group by status
You can then select total from status_counts where status = 'pass' or the like and it will run the query.
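A small sketch of that behaviour, assuming SQLite through Python's sqlite3 module rather than SQL Server, and a handful of made-up rows in some_table. The helper pass_total is hypothetical and exists only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (status TEXT)")
# Hypothetical rows, only to have something to count.
conn.executemany(
    "INSERT INTO some_table VALUES (?)",
    [("pass",), ("pass",), ("fail",), ("pass",)],
)
conn.execute(
    "CREATE VIEW status_counts AS "
    "SELECT status, COUNT(*) AS total FROM some_table GROUP BY status"
)


def pass_total():
    # The view's query runs each time the view is selected from.
    row = conn.execute(
        "SELECT total FROM status_counts WHERE status = 'pass'"
    ).fetchone()
    return row[0]


before = pass_total()
conn.execute("INSERT INTO some_table VALUES ('pass')")
after = pass_total()
print(before, after)
```

The second call sees the newly inserted row because a plain view stores only the query, never the results.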
You can also create a "materialized view". This is like a view, but the results are written to a real table, and the database keeps that table up to date for you. The syntax below is the Azure Synapse Analytics form; on a regular SQL Server instance the closest equivalent is an indexed view.
create materialized view status_counts with (distribution = hash(status)) as
select status, count(*) as total
from some_table
group by status
You'd do this for performance reasons on a large table which does not update very often.
My table looks similar to this:
| date_of_register | account_type1 | account_type2 |
| 18/11/02 23:56:59 | type_a | type_b |
I want to count registrations of different types of users per day. account_type1 can be type_a or null, account_type2 can be type_b or null.
the result should look for one example day like this:
DATE | registers type_a | registers type_b|
18/11/02 | 32 | 21 |
But I want to make this for two months.
I'm not sure how to count records from different columns and get result like this. Is it possible?
You want to count occurrences per day. The day is the truncated date. So aggregate by TRUNC(<datecolumn>). Counting is easy, as you merely want to count non-null occurrences, which COUNT(<column>) is made for.
select
trunc(date_of_register),
count(account_type1) as registers_type_a,
count(account_type2) as registers_type_b
from mytable
group by trunc(date_of_register)
order by trunc(date_of_register);
You could do:
SELECT
TRUNC(date_of_register),
COUNT(CASE WHEN account_type1 = 'type_a' THEN 1 END) AS "registers type_a",
COUNT(CASE WHEN account_type2 = 'type_b' THEN 1 END) AS "registers type_b"
FROM registrations_table
GROUP BY TRUNC(date_of_register)
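One thing to watch with the CASE variant: COUNT counts every non-null value, so a CASE with an ELSE 0 branch would count all rows, because 0 is not NULL. Either leave the ELSE out or switch to SUM. Here is a quick sketch of the difference, assuming SQLite through Python's sqlite3 module, with date() standing in for Oracle's TRUNC() and three made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE registrations (date_of_register TEXT, "
    "account_type1 TEXT, account_type2 TEXT)"
)
# Hypothetical rows; None becomes SQL NULL.
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?, ?)",
    [
        ("2018-11-02 23:56:59", "type_a", "type_b"),
        ("2018-11-02 10:00:00", "type_a", None),
        ("2018-11-03 09:30:00", None, "type_b"),
    ],
)

per_day = conn.execute(
    """
    SELECT date(date_of_register) AS day,   -- SQLite stand-in for TRUNC()
           COUNT(account_type1) AS count_col,
           SUM(CASE WHEN account_type1 = 'type_a' THEN 1 ELSE 0 END) AS sum_case,
           COUNT(CASE WHEN account_type1 = 'type_a' THEN 1 ELSE 0 END) AS count_else_0
    FROM registrations
    GROUP BY date(date_of_register)
    ORDER BY day
    """
).fetchall()
print(per_day)
```

On the second day there are no type_a registrations, yet the COUNT(... ELSE 0 ...) column still reports 1, while the other two columns report 0.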
I'm looking for a quick solution in SQL.
I used to have a clunky formula in Excel: =IF(COUNTIF($C$2:C2,C2)>1,0,COUNTIF($C$2:C2,C2)) to print 1 for a unique item and 0 for a repeat.
Then I moved to =1-(C1-C2), which kind of did the job, but not accurately. Now I'm looking for SQL that could do a similar job. An example of the result I need:
NUMBER UNIQUE
6573455300000 1
6573455300000 0
6573455300000 0
6573455300000 0
6573411981080 1
6573411981080 0
6573411981080 0
6573411981080 0
Does anyone know any kind of code to achieve this?
using row_number():
select
col
, [first] = case when row_number() over (partition by col order by (select 1)) > 1 then 0 else 1 end
from t
rextester demo: http://rextester.com/FWA89661
returns:
+---------------+-------+
| col | first |
+---------------+-------+
| 6573411981080 | 1 |
| 6573411981080 | 0 |
| 6573411981080 | 0 |
| 6573411981080 | 0 |
| 6573455300000 | 1 |
| 6573455300000 | 0 |
| 6573455300000 | 0 |
| 6573455300000 | 0 |
+---------------+-------+
Use window functions. In your case, you seem to want the first row and mark that, so row_number() looks like the solution:
select t.*,
(case when row_number() over (partition by number order by ?) = 1
then 1 else 0
end) as flag
from t;
The ? is for the column that specifies the ordering (which is first). If you want just one row but don't care which, then you can use order by number or order by (select null).
UNIQUE is a SQL keyword (think "unique index"), so it is a bad name for a column. That is why I changed to the generic flag, although you might prefer first_row_flag or something like that.
SELECT
[number],
case when rown = 1 then 1 else 0 end as [unique]
FROM
(
SELECT
[number], row_number() OVER(partition by [number] order by [number]) as rown
FROM
t
) a
This doesn't strictly have to be done with a subquery, but the subquery is unlikely to affect overall performance, and arranging it this way helps you see what is going on. If you run just the inner query in isolation, you'll see that the most important work is done by row_number. The data is partitioned into buckets based on the value of [number], something like a GROUP BY, except that repeated values are not suppressed. Within each partition, every occurrence of [number] is numbered with an incrementing counter, and when a different value of [number] is encountered the numbering restarts from 1.
The ORDER BY clause is there only because SQL Server demands one and we don't know anything else about your table. If one of the occurrences would be a better candidate to be labelled [unique] = 1, find a way to sort that row into position 1. A typical use of this pattern is "latest record", in which case the ORDER BY part would be [datecolumn] DESC.
Once you have a per-number counter that resets itself, all that is left is a standard CASE expression that returns 1 when the counter is 1 and 0 otherwise, to match your desired result.
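If you want to watch the numbering restart per value, here is a minimal sketch. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module instead of SQL Server, and it adds a hypothetical id column so that "first" is well defined.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical column that fixes which duplicate counts as first.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, number INTEGER)")
conn.executemany(
    "INSERT INTO t (number) VALUES (?)",
    [(6573455300000,)] * 4 + [(6573411981080,)] * 4,
)

flags = conn.execute(
    """
    SELECT number,
           CASE WHEN ROW_NUMBER() OVER (PARTITION BY number ORDER BY id) = 1
                THEN 1 ELSE 0 END AS first_row_flag
    FROM t
    ORDER BY id
    """
).fetchall()
print([flag for _, flag in flags])
```

Ordering the window by id rather than by the partition column itself makes the choice of the flagged row deterministic.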
select t.Number,
case when t.num = 1 then 1 else 0 end as [Unique]
from (select Number,
row_number() over (partition by Number order by Number) as num
from MyTbl) t
order by t.Number
I have a complicated stored procedure that calculates a column of numeric values and returns it as part of a data set containing other columns as well. I am trying to find a way to return the SUM of that special column in the same query. I use SQL Management Studio and was thinking of using an OUT parameter or even a RETURN value, but if there is a more SQL-ish way to do it I would definitely prefer it.
SELECT
OrID, QN, PRID, PCKID, Person, Price, CSID,
CASE
WHEN (COUNT(*) OVER (PARTITION BY OrID)) > 1
THEN Price * 0.2
ELSE Price * 0.1
END AS Commission
FROM
( < my subquery > )
I would also like to add SUM(Commission) to the results of the above statement.
If my data is (partial)
OrID|Price
----+-----
1 | 100
2 | 100
2 | 50
3 | 80
I will get the following result
OrID|Price|Commission
----+-----+----------
1 | 100 | 10
2 | 100 | 20
2 | 50 | 10
3 | 80 | 8
And somewhere I would also like to see the SUM of the last column - 48
Something like Excel's SUM function at the end of the Commission column
You can use a subquery:
SELECT s.*, SUM(Commission) OVER (PARTITION BY OrID) as sum_commission
FROM (SELECT OrID, QN, PRID, PCKID, Person, Price, CSID,
(CASE WHEN (count(*) OVER (PARTITION BY OrID)) > 1
THEN Price*0.2
ELSE Price*0.1
END) AS Commission
FROM (< my subquery >
) q
) s;
I assume you want the sum per OrID. If you want one grand total over all rows instead, remove the PARTITION BY.
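For the single total the question asks about, a window SUM with an empty OVER () is one option. It repeats the grand total on every row. Here is a rough sketch on the four sample rows, assuming SQLite 3.25 or later through Python's sqlite3 module. The table name orders and the alias c are placeholders for the real subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "orders" is a placeholder for the asker's subquery.
conn.execute("CREATE TABLE orders (OrID INTEGER, Price REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100), (2, 100), (2, 50), (3, 80)],
)

rows = conn.execute(
    """
    SELECT c.OrID, c.Price, c.Commission,
           SUM(c.Commission) OVER () AS total_commission  -- empty OVER: all rows
    FROM (
        SELECT OrID, Price,
               CASE WHEN COUNT(*) OVER (PARTITION BY OrID) > 1
                    THEN Price * 0.2
                    ELSE Price * 0.1
               END AS Commission
        FROM orders
    ) c
    ORDER BY c.OrID, c.Price DESC
    """
).fetchall()
for row in rows:
    print(row)
```

Every row carries the same total_commission value, so the client can read it from the first row.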
Try using the with Rollup command. It does what you want
https://technet.microsoft.com/en-us/library/ms189305%28v=sql.90%29.aspx?f=255&MSPPError=-2147217396
Is it possible in SQL Server to detect whether two cells are the same? For example:
ID | Quantity |SerialNo | QuantityRemaining
1 | 1 | 1234 | 0
2 | -1 | 1234 | 0
and then, based on the serial number matching, amend the overall quantity for that field to 0, as in the example above? Or is there a more efficient way? Or is it better to simply update the total quantity field within a view I have, which calculates the total based on a product code?
You can use an aggregate function:
SELECT SerialNo,
SUM(Quantity) AS QuantityRemaining
FROM YourTable
GROUP BY SerialNo
ORDER BY SerialNo
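Run against the two rows from the question, the grouped sum nets out to zero. Here is a minimal sketch, assuming SQLite through Python's sqlite3 module in place of SQL Server, with stock as a made-up table name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "stock" is a hypothetical name for the asker's table.
conn.execute("CREATE TABLE stock (ID INTEGER, Quantity INTEGER, SerialNo INTEGER)")
conn.executemany(
    "INSERT INTO stock VALUES (?, ?, ?)",
    [(1, 1, 1234), (2, -1, 1234)],
)

remaining = conn.execute(
    "SELECT SerialNo, SUM(Quantity) AS QuantityRemaining "
    "FROM stock GROUP BY SerialNo ORDER BY SerialNo"
).fetchall()
print(remaining)
```

Computing the remaining quantity in a query or view like this avoids storing a QuantityRemaining column that has to be kept in sync by hand.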