SQL: identify if there are multiples (not duplicates) in a column


I am struggling to find a way to identify certain patterns in my data using SSMS.
I wish to identify rows that contain multiples (x*2, x*3, or x*4) of another entry within the same column.
I really have no clue how to even start my "where" clause right now.
SELECT [numbers], [product_ID]
FROM [db].[dbo].[tablename]
WHERE [numbers] = numbers*2
My problem is that the code above can obviously only identify zeros.
Google only helps me with finding duplicates, but I can't find a way to identify multiples of a value.
My desired result would be a table that only contains numbers (linked to product_IDs) that are multiples of each other.
Can anyone help me out here?

If a column contains multiples, then all are multiples of the smallest non-zero value. Let me assume the values are positive or zero for this purpose.
So, you can determine if this is the case using window functions and modulo arithmetic:
select t.*
from (select t.*,
             min(case when number > 0 then number end) over () as min_number
      from t
     ) t
where number % min_number = 0 or min_number = 1;
If you want to know whether all numbers meet this criterion, use aggregation:
-- max, not min: the smallest value always has remainder 0, so min() would always report success
select (case when max(number % min_number) = 0 then 'all multiples' else 'oops' end)
from (select t.*,
             min(case when number > 0 then number end) over () as min_number
      from t
     ) t;
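To make the two queries above easy to try, here is a minimal sketch that runs them against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here, the helper name check_multiples is hypothetical, and the sample values are made up. It assumes non-negative integers and an SQLite build with window-function support (3.25 or later). The "all multiples" check takes the MAX of the remainders, since the smallest value divides itself and therefore always has remainder 0.

```python
import sqlite3


def check_multiples(values):
    """Hypothetical helper: load values into t(number) and run both checks."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (number INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])

    # Rows that are multiples of the smallest positive value in the column.
    multiples = [row[0] for row in conn.execute("""
        SELECT number
        FROM (SELECT number,
                     MIN(CASE WHEN number > 0 THEN number END) OVER () AS min_number
              FROM t) AS s
        WHERE number % min_number = 0
        ORDER BY number
    """)]

    # A single non-zero remainder is enough to fail the whole column.
    verdict = conn.execute("""
        SELECT CASE WHEN MAX(number % min_number) = 0
                    THEN 'all multiples' ELSE 'oops' END
        FROM (SELECT number,
                     MIN(CASE WHEN number > 0 THEN number END) OVER () AS min_number
              FROM t) AS s
    """).fetchone()[0]
    conn.close()
    return multiples, verdict
```

With the made-up input [4, 8, 12, 0, 6], the smallest positive value is 4, so 6 is filtered out and the column as a whole fails the check. Note that 0 counts as a multiple of everything under this definition.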

"My desired result would be a table that only contains numbers (linked to product_IDs) that are multiples of each other"
You'll need to test all pairs of rows, which means a CROSS JOIN.
Something like this:
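with q as
(
    -- qualify the columns with an alias; unqualified names are ambiguous in a self-join
    SELECT a.[numbers],
           a.[product_ID],
           -- nullif avoids a divide-by-zero error when b.numbers is 0
           cast(a.numbers as float) / nullif(b.numbers, 0) ratio
    FROM [tablename] a
    CROSS JOIN [tablename] b
)
select *
from q
where ratio = cast(ratio as bigint)
and ratio > 1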
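The pairwise test can also be written with integer modulo instead of a float ratio, which sidesteps rounding questions. The following is a rough sketch only, run on an in-memory SQLite database as a stand-in for SQL Server. The function name multiple_pairs, the extra output column base, and the sample rows are hypothetical, and the query assumes positive integers.

```python
import sqlite3


def multiple_pairs(rows):
    """rows: (product_ID, numbers) tuples. Returns (numbers, product_ID, base) triples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tablename (product_ID INTEGER, numbers INTEGER)")
    conn.executemany("INSERT INTO tablename VALUES (?, ?)", rows)
    result = conn.execute("""
        SELECT a.numbers, a.product_ID, b.numbers AS base
        FROM tablename AS a
        CROSS JOIN tablename AS b
        WHERE b.numbers <> 0
          AND a.numbers > b.numbers       -- strictly larger, so equal values (duplicates) are excluded
          AND a.numbers % b.numbers = 0   -- integer test, no float rounding involved
        ORDER BY a.numbers, b.numbers
    """).fetchall()
    conn.close()
    return result
```

A value appears once per divisor it has in the column, so wrap the query in SELECT DISTINCT over numbers and product_ID if one row per product is wanted. The cross join compares every row with every other row, so the cost grows quadratically with the table size.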

Related

Query - display zero where null in one column and select sum of two columns where not null in next column

I need to display a zero where "Silo Wt" is null, and display the sum of the two values in the Total column even if "Silo Wt" is null. The Total column may not require any changes if I can get a zero in the Silo column.
SELECT DISTINCT (coffee_type) AS "Coffee_Type",
(SELECT ItemName
FROM [T01_Item_Name_TBL]
WHERE Item = B.Coffee_Type) AS "Description",
(SELECT COUNT(Green_Inventory_ID)
FROM [Green_Inventory] AS A
WHERE A.Coffee_Type = B.Coffee_Type
AND current_Quantity > 0) AS "Current Units",
SUM((Unit_Weight) * (Current_Quantity)) AS "Green Inv Wt",
(SELECT SUM(TGWeight)
FROM [P04_Green_STotal_TBL] AS C
WHERE TGItem = Coffee_type) AS "Silo Wt",
(SUM((Unit_Weight) * (Current_Quantity)) +
(SELECT SUM(TGWeight)
FROM [P04_Green_STotal_TBL] AS C
WHERE TGItem = Coffee_type)) AS Total
FROM
[Green_Inventory] AS B
WHERE
Pallet_Status = 0
GROUP BY
Coffee_Type
[Screenshot: current query results]

You just need to wrap them in ISNULL.
However, your query could do with some serious cleanup and simplification:
DISTINCT makes no sense as you are grouping by that column anyway.
Two of the subqueries can be combined using OUTER APPLY, although this requires moving the grouped Green_Inventory into a derived table.
Another subquery, the self-join on Green_Inventory, can be transformed into conditional aggregation.
Not sure whether I've got the logic right, as the subquery did not have a filter on Pallet_Status, but it looks like you would also need to move that condition into conditional aggregation for the SUM, and use a HAVING. It depends exactly on your requirements.
Don't use quoted table or column names unless you have to.
Use meaningful table aliases, rather than A B C.
Specify table names when referencing columns, especially when using subqueries, or you might get unintended results.
SELECT
gi.Coffee_Type,
(SELECT ItemName
FROM T01_Item_Name_TBL AS n
WHERE n.Item = gi.coffee_Type
) AS Description,
gi.CurrentUnits,
gi.GreenInvWt,
ISNULL(gst.TGWeight, 0) AS SiloWt,
ISNULL(gi.GreenInvWt, 0) + ISNULL(gst.TGWeight, 0) AS Total
FROM (
SELECT
gi.Coffee_Type,
COUNT(CASE WHEN gi.current_Quantity > 0 THEN 1 END) AS CurrentUnits,
SUM(CASE WHEN gi.Pallet_Status = 0 THEN gi.Unit_Weight * gi.Current_Quantity END) AS GreenInvWt
FROM
Green_Inventory AS gi
GROUP BY
gi.Coffee_Type
HAVING
SUM(CASE WHEN gi.Pallet_Status = 0 THEN gi.Unit_Weight * gi.Current_Quantity END) > 0
) AS gi
OUTER APPLY (
SELECT SUM(gst.TGWeight) AS TGWeight
FROM P04_Green_STotal_TBL AS gst
WHERE gst.TGItem = gi.Coffee_Type
) AS gst;
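The reason the ISNULL wrapper matters is that NULL plus any number is NULL, so a missing silo weight wipes out the whole Total. Below is a small sketch of that effect on an in-memory SQLite database. The tables inv and silo and their rows are made up for illustration, and COALESCE is used because SQLite has no two-argument ISNULL (COALESCE also works in SQL Server).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inv  (coffee_type TEXT, weight REAL);
    CREATE TABLE silo (coffee_type TEXT, weight REAL);
    INSERT INTO inv  VALUES ('A', 100), ('B', 50);
    INSERT INTO silo VALUES ('A', 25);          -- deliberately no silo row for 'B'
""")

rows = conn.execute("""
    SELECT i.coffee_type,
           i.weight + s.silo_wt                           AS naive_total,  -- NULL when silo_wt is NULL
           COALESCE(s.silo_wt, 0)                         AS silo_wt,
           COALESCE(i.weight, 0) + COALESCE(s.silo_wt, 0) AS total
    FROM inv AS i
    LEFT JOIN (SELECT coffee_type, SUM(weight) AS silo_wt
               FROM silo
               GROUP BY coffee_type) AS s
           ON s.coffee_type = i.coffee_type
    ORDER BY i.coffee_type
""").fetchall()
conn.close()
```

For coffee type 'B' the naive total comes back as NULL (None in Python), while the wrapped columns give a silo weight of 0 and a total equal to the inventory weight.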

How do I count distinct to exclude a value?

Below are the different scales in a POS system. I am trying to count the number of distinct scales that are not 'MANUAL WT'.
This is what I have, but it returns 2 instead of 6.
count (distinct (case when d.SCALE_IN_ID != 'MANUAL WT' then 1 else 0 end)) as Num_Scale

Consider:
select count(distinct case when scale_in_id <> 'MANUAL WT' then scale_in_id end) cnt
from mytable
The problem with your original query is that the case expression turns every value into either 0 or 1, and the aggregate function then counts how many distinct values are returned. Since the values are all 0s or 1s, there are only two distinct values (or one in edge cases), hence the result you are getting.
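The difference between the two expressions is easy to see side by side. This is only an illustrative sketch on an in-memory SQLite database; the scale names S1 to S3 are made up, and mytable follows the naming used in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (scale_in_id TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?)",
    [("S1",), ("S2",), ("S2",), ("S3",), ("MANUAL WT",), ("MANUAL WT",)],
)

flags, scales = conn.execute("""
    SELECT
        -- counts the distinct flags 0 and 1, not the scales
        COUNT(DISTINCT CASE WHEN scale_in_id <> 'MANUAL WT' THEN 1 ELSE 0 END),
        -- counts distinct scale ids; 'MANUAL WT' maps to NULL, which COUNT ignores
        COUNT(DISTINCT CASE WHEN scale_in_id <> 'MANUAL WT' THEN scale_in_id END)
    FROM mytable
""").fetchone()
conn.close()
```

Here flags is 2 no matter how many scales exist, while scales is 3 for the three made-up scale ids.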

A simple WHERE clause will do:
select count(distinct scale_in_id) Num_Scale
from tablename
where scale_in_id <> 'MANUAL WT'

Create a new table with columns with case statements and max function

I have some problems in creating a new table from an old one with new columns defined by case statements.
I need to add to a new table three columns, where I compute the maximum based on different conditions. Specifically,
if time is between 1 and 3, I define a variable max_var_1_3 as max((-1)*var),
if time is between 1 and 6, I define a variable max_var_1_6 as max((-1)*var),
if time is between 1 and 12, I define a variable max_var_1_12 as max((-1)*var),
The max function needs to take the maximum value of the variable var in the window between 1 and 3, 1 and 6, 1 and 12 respectively.
I wrote this
create table new as(
select t1.*,
(case when time between 1 and 3 then MAX((-1)*var)
else var
end) as max_var_1_3,
(case when time between 1 and 6 then MAX((-1)*var)
else var
end) as max_var_1_6,
(case when time between 1 and 12 then MAX((-1)*var)
else var
end) as max_var_1_12
from old_table t1
group by time
) with data primary index time
but unfortunately it is not working. The old_table already has some columns, and I would like to import all of them and then compare the old table with the new one. I get an error saying that something is expected between ')' and ',', but I cannot work out what. I am using Teradata SQL.
Could you please help me?
Many thanks

The problem is that you have GROUP BY time in your query while trying to return all the other values with your SELECT t1.*. To make your query work as-is, you'd need to add each column from t1.* to your GROUP BY clause.
If you want to find the MAX value within the different time ranges AND also return all the rows, then you can use a window function. Something like this:
CREATE TABLE new AS (
SELECT
t1.*,
CASE
WHEN t1.time BETWEEN 1 AND 3 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 3 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_3,
CASE
WHEN t1.time BETWEEN 1 AND 6 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 6 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_6,
CASE
WHEN t1.time BETWEEN 1 AND 12 THEN (
MAX(CASE WHEN t1.time BETWEEN 1 AND 12 THEN (-1 * t1.var) ELSE NULL END) OVER()
)
ELSE t1.var
END AS max_var_1_12
FROM old_table t1
) WITH DATA PRIMARY INDEX (time)
;
Here's the logic:
check if a row falls in the range
if it does, return the desired MAX value for rows in that range
otherwise, just return that given row's default value (var)
return all rows along with the three new columns
If you have performance issues, you could also move the max_var calculations to a CTE, since they only need to be calculated once. Also to avoid confusion, you may want to explicitly specify the values in your SELECT instead of using t1.*.
I don't have a TD system to test, but try it out and see if that works.
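Since the query above was not tested on Teradata, here is a rough sketch of the same window-function logic on an in-memory SQLite database, which only checks the logic and says nothing about Teradata syntax such as WITH DATA or PRIMARY INDEX. The sample rows are made up, only two of the three ranges are shown to keep it short, and it assumes an SQLite build with window-function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE old_table (time INTEGER, var REAL)")
conn.executemany(
    "INSERT INTO old_table VALUES (?, ?)",
    [(1, 5.0), (2, -3.0), (4, -8.0), (7, 2.0), (13, -20.0)],
)

rows = conn.execute("""
    SELECT time, var,
           -- inside the range: the max of -var over all rows in that range
           -- outside the range: fall back to the row's own var
           CASE WHEN time BETWEEN 1 AND 3
                THEN MAX(CASE WHEN time BETWEEN 1 AND 3 THEN -1 * var END) OVER ()
                ELSE var END AS max_var_1_3,
           CASE WHEN time BETWEEN 1 AND 12
                THEN MAX(CASE WHEN time BETWEEN 1 AND 12 THEN -1 * var END) OVER ()
                ELSE var END AS max_var_1_12
    FROM old_table
    ORDER BY time
""").fetchall()
conn.close()
```

Every input row is kept. Rows with time 1 and 2 share the same max_var_1_3, and the row with time 13 falls outside both ranges, so both new columns simply repeat its var.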

I cannot help with the CREATE TABLE AS, but the query you want is this:
SELECT
t.*,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 3) AS max_var_1_3,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 6) AS max_var_1_6,
(SELECT MAX(-1 * var) FROM old_table WHERE time BETWEEN 1 AND 12) AS max_var_1_12
FROM old_table t;

Get ratio between the length of a table and one of its subsets via SQL

I have a table named A that contains a column named x. What I'm trying to do is to count the number of items that belong to a certain subset of A (more precisely, the ones that satisfy the x > 4 condition) via a single SELECT query, for example:
SELECT COUNT(*)
FROM A
WHERE x > 4;
From thereon, I'd like to calculate the ratio between the size of this particular subset of A and A as a whole, i.e. perform the following division:
size_subset / size_A
My question is - how would I combine all of these pieces into a single SQL SELECT query?

My server is down, so I can't verify the answers below:
SELECT 1.0 * count(case when x > 4 then x else null end) / COUNT(*) FROM A;
This is slightly better because it is just a count, not a sum (NULLs are not counted). The 1.0 * avoids integer division on engines that truncate integer / integer.
But I prefer:
select 1.0 * (SELECT count(*) FROM A where x > 4) / (SELECT count(*) FROM A);
as I would guess it can run faster.

You want conditional aggregation:
SELECT 1.0 * sum(case when x > 4 then 1 else 0 end) / COUNT(*) -- 1.0 * avoids integer division
FROM A;

There's probably a less clunky way of doing this, but:
SELECT 1.0 * SUM(CASE WHEN x > 4 THEN 1 ELSE 0 END) / COUNT(*) FROM A -- 1.0 * avoids integer division
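One thing to watch with any of these ratio queries: on engines that truncate integer division (SQL Server, PostgreSQL and SQLite do; MySQL's / operator does not), dividing one integer count by another yields 0 for any proper subset. The sketch below shows this on an in-memory SQLite database with made-up values for x, alongside an AVG over a 0/1.0 flag as one possible alternative. The AVG form is a suggestion here, not something taken from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (x INTEGER)")
conn.executemany("INSERT INTO A VALUES (?)", [(1,), (3,), (5,), (7,), (2,)])

int_ratio, avg_ratio = conn.execute("""
    SELECT
        -- integer / integer: truncated in SQLite
        SUM(CASE WHEN x > 4 THEN 1 ELSE 0 END) / COUNT(*),
        -- average of a 0/1.0 flag: the fraction of rows with x > 4
        AVG(CASE WHEN x > 4 THEN 1.0 ELSE 0 END)
    FROM A
""").fetchone()
conn.close()
```

Two of the five made-up rows satisfy x > 4, so the integer form gives 0 while the AVG form gives 0.4.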

sql query: subtract from results the corresponding USD, YEN value which has Type='r'

I need help with a query. Consider the following table:
I need to select first the sum of each Code from table. I am doing it with simple sum and group by statement. Then I have to subtract the results from each code sum where type='r'
1) Say for first part of query, we will get 2 rows from SUM (one with total USD and one with total YEN)
2) Now I need to subtract from these results the corresponding USD, YEN value which has Type='r'
I have to do it inside SQL and not a stored procedure.

Why not use a WHERE statement to say WHERE Type != 'r' so that those values never even get added to sum in the first place...
SELECT `Code`, SUM(`Amount`) AS `Total`
FROM `Table`
WHERE `Type` != 'r'
GROUP BY `Code`;
Something like that.

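-- qualify code (it exists in both derived tables); coalesce covers codes that have no type 'r' rows
select l.code, l.amount - coalesce(r.amount, 0) as net_amount
from
(select code, sum(amount) as amount from my_table group by code) l
left join (select code, sum(amount) as amount from my_table where type = 'r' group by code) r
on l.code = r.code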

You can do this in a single, simple query:
select
code,
sum(case when type = 'r' then (-1 * amount) else amount end) as sum
from
yourtable
group by
code
Basically, you're changing the sign of the rows that have type = 'r', so when you sum all rows for a particular code you'll get the correct answer.

Does it have to be a single query?
I'd say SUM the total, then SUM the subcategory where Type='r', then subtract one from the other.
You could do this in one line of SQL, but I'm pretty sure it would be either joining the table with itself or using a subquery. Either way, it's doing the same amount of work as the above.

Try:
select code,
sum(amount) gross_total,
sum(case when type = 'r' then amount else 0 end) type_r_total,
sum(case when type != 'r' then amount else 0 end) net_total
from yourtable
group by code;
to see the overall totals, type R only totals and non-type R totals for each currency on one row per currency, in a single pass.
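The answers in this thread implement two different readings of the question, and they do not return the same numbers. Subtracting the type 'r' sum from the gross total leaves the sum of the non-'r' rows (the same result as filtering with WHERE type != 'r'), whereas flipping the sign of the 'r' rows yields the non-'r' sum minus the 'r' sum. The sketch below compares both on an in-memory SQLite database. The sample rows and the type value 'p' are made up, since the original table was not preserved.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (code TEXT, type TEXT, amount INTEGER)")
conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", [
    ("USD", "p", 100), ("USD", "r", 50),
    ("YEN", "p", 1000), ("YEN", "p", 300), ("YEN", "r", 200),
])

rows = conn.execute("""
    SELECT code,
           SUM(amount) AS gross_total,
           -- reading 1: gross total minus the 'r' rows (equals the non-'r' sum)
           SUM(amount) - SUM(CASE WHEN type = 'r' THEN amount ELSE 0 END) AS gross_minus_r,
           -- reading 2: 'r' rows counted negatively (non-'r' sum minus 'r' sum)
           SUM(CASE WHEN type = 'r' THEN -amount ELSE amount END) AS sign_flipped
    FROM yourtable
    GROUP BY code
    ORDER BY code
""").fetchall()
conn.close()
```

For the made-up USD rows (100 of type 'p', 50 of type 'r') the first reading gives 100 and the second gives 50, so it is worth confirming which one is actually wanted before picking a query.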
